SPARQL: combine and exclude regex filters

I want to filter my SPARQL query for specific keywords while excluding others. I thought this could be done with FILTER (regex(str(?var),"includedKeyword","i") && !regex(str(?var),"excludedKeyword","i")). The query works without the negated ("!") condition, but not with it. I also tried splitting the conditions into separate FILTER statements, to no avail.

I used this query on http://europeana.ontotext.com/ :

    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX edm: <http://www.europeana.eu/schemas/edm/>
    PREFIX ore: <http://www.openarchives.org/ore/terms/>

    SELECT DISTINCT ?CHO
    WHERE {
      ?proxy dc:subject ?subject .
      FILTER ( regex(str(?subject),"gemälde","i") && !regex(str(?subject),"Fotografie","i") )
      ?proxy edm:type "IMAGE" .
      ?proxy ore:proxyFor ?CHO .
      ?agg edm:aggregatedCHO ?CHO ; edm:country "germany" .
    }

But the first result row is always the item titled "Gemäldegalerie", which has a dc:subject of "Fotografie" (the value I want excluded). I suspect the problem is that one object in the Europeana database can have more than one dc:subject property, so the filter may be looking at only one of these values while ignoring the others.

Any ideas? Would be very thankful!

Answers


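The problem is that your combined filter tests both conditions against the same binding of ?subject. It therefore succeeds as soon as a single value of ?subject satisfies both conditions, which is almost always the case: the string "Gemäldegalerie", for example, matches your first regex and does not match the second. Any other dc:subject value of the same item, such as "Fotografie", fails the filter, but that only discards that one solution, not the item itself.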
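
To see this in isolation, here is a small self-contained sketch that replaces the Europeana data with inline VALUES rows. The ex: IRIs and the three rows are made up purely for illustration; only the filter is taken from your query.

```sparql
PREFIX ex: <http://example.org/>

SELECT DISTINCT ?item
WHERE {
  # Hypothetical data: ex:item1 has two subjects, ex:item2 has one.
  VALUES (?item ?subject) {
    (ex:item1 "Gemäldegalerie")
    (ex:item1 "Fotografie")
    (ex:item2 "Gemälde")
  }
  # The filter is evaluated once per row, i.e. per single ?subject value.
  FILTER ( regex(str(?subject), "gemälde", "i") && !regex(str(?subject), "Fotografie", "i") )
}
```

The row (ex:item1, "Fotografie") is dropped, but the row (ex:item1, "Gemäldegalerie") passes both conditions, so ex:item1 still appears in the result next to ex:item2.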

So for the negative condition, you need to formulate something that checks all values of dc:subject for the item, rather than just one particular value. You can do this with SPARQL's FILTER NOT EXISTS construct, for example like this:

    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX edm: <http://www.europeana.eu/schemas/edm/>
    PREFIX ore: <http://www.openarchives.org/ore/terms/>

    SELECT DISTINCT ?CHO
    WHERE {
      ?proxy edm:type "IMAGE" .
      ?proxy ore:proxyFor ?CHO .
      ?agg edm:aggregatedCHO ?CHO ; edm:country "germany" .
      # Positive condition: at least one subject matches "gemälde".
      ?proxy dc:subject ?subject .
      FILTER(regex(str(?subject),"gemälde","i"))
      # Negative condition: no subject of the same proxy matches "Fotografie".
      FILTER NOT EXISTS {
        ?proxy dc:subject ?otherSubject .
        FILTER(regex(str(?otherSubject),"Fotografie","i"))
      }
    }

As an aside: since you are doing regular expression checks, and now combining them with a NOT EXISTS operator, this is likely to become very expensive for the query processor quite quickly. You may want to think about alternative ways to formulate your query, for example by including or excluding on the exact subject string, which eliminates the regex. You could also look at non-standard extensions that the SPARQL endpoint might provide. OWLIM, the store on which the Europeana endpoint runs, supports various full-text-search extensions, though I am not sure they are enabled in the Europeana endpoint.
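
A rough sketch of the exact-string variant, assuming the subjects are stored as plain literals spelled exactly "Gemälde" and "Fotografie". That spelling is my assumption, not something I have checked against the Europeana data, so look at the actual values (including any language tags) first:

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>

SELECT DISTINCT ?CHO
WHERE {
  ?proxy edm:type "IMAGE" ;
         ore:proxyFor ?CHO ;
         dc:subject "Gemälde" .      # assumed exact literal to include
  ?agg edm:aggregatedCHO ?CHO ; edm:country "germany" .
  # Drop every proxy that also carries the assumed exact literal to exclude.
  FILTER NOT EXISTS { ?proxy dc:subject "Fotografie" }
}
```

Note that this is stricter than the regex version: it no longer matches substrings or different capitalisation, so a subject such as "Gemäldegalerie" would not be picked up.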

