Library of Congress Linked Data Service Technical Center: Searching/Querying

Searching

  • Search results are sorted by relevance to your query.
  • Asterisk (*) wildcards may be used to represent any number of characters.
  • Question mark (?) wildcards may be used to represent a single character.
  • Boolean AND, NOT, OR commands may be used.
  • Multiple terms are treated as an AND query.
  • Enclosing terms in double quotes will treat them as a phrase, not as multiple separate terms.

The asterisk symbol may be used at the beginning, end, or middle of a term.

For example, a search for "*dog*" will return:

  • Dogfish
  • Bulldog
  • Lepadogaster
  • etc.
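The wildcard rules above can be tried out locally. The following is a minimal sketch that uses Python's `fnmatch` module as a stand-in; it is not the service's own matcher. The helper name `wildcard_match` and the sample terms are hypothetical, and case-insensitive matching is an assumption inferred from the "*dog*" example returning "Dogfish".

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern: str, term: str) -> bool:
    """Glob-style match: * is any run of characters, ? is exactly one.

    Both sides are lowercased, which assumes case-insensitive matching.
    Note that fnmatch also gives [ and ] a special meaning, which the
    search documentation does not mention.
    """
    return fnmatchcase(term.lower(), pattern.lower())


terms = ["Dogfish", "Bulldog", "Catfish"]
matches = [t for t in terms if wildcard_match("*dog*", t)]
print(matches)
```

The same helper shows the single-character rule: `wildcard_match("d?g", "dog")` is true, while `wildcard_match("d?g", "doog")` is false.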

Search limits / Special search constraints

Search limits or special search constraints are applied by invoking a constraint, followed by a colon, and then the value to query. These can be combined with the Boolean options above to perform complex searches of the system. The following constraints, with examples, are available:

Known-label retrieval

If you have a known label or heading but are unsure of its URI, you can find the established URI through the label functionality of the LC Linked Data Service. For instance, if your label or heading is "Orchids", use this URI to obtain an HTTP 302 Found response that redirects to the established URI:

Or for 'France' in the countries list:

The URI syntax for the label functionality is to use the token "label/", followed by a case-insensitive string for the search term.

  • id.loc.gov/authorities/{scheme_name}/label/{term}

or:

  • id.loc.gov/vocabulary/{vocabulary_name}/label/{term}

{scheme_name} in the first example is optional. Omitting the scheme name searches everything under /authorities. Do not place a trailing slash after the known label or heading. Stemming, truncation, and similar features are not supported, so wildcard characters such as the asterisk (*) or the percent sign (%) should not appear within the string unless they are part of the stored heading or label. The entire string must match a label or heading stored within the system.

If the label functionality cannot match the string provided, an HTTP 404 Not Found response is returned. If your label contains a character outside the 128-character range of the American Standard Code for Information Interchange (ASCII), it is strongly advised to URL-encode your string before sending the request. Browsers such as Safari and Firefox do this automatically, whereas Internet Explorer does not always do so. For example, the URL-encoded representation for:

  • Bărăganul (Romania)

is:

  • B%C4%83r%C4%83ganul%20%28Romania%29
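The encoding step and the label URI syntax can be combined in a few lines. This is a minimal sketch: `label_url` and `BASE` are hypothetical names, the `https` scheme is assumed, and encoding every reserved character (including "/") with `safe=""` is a cautious assumption rather than a documented requirement.

```python
from urllib.parse import quote

BASE = "https://id.loc.gov"  # assumed base; the syntax above lists only the host


def label_url(term: str, scheme_path: str = "authorities") -> str:
    """Build a known-label URL such as {BASE}/authorities/subjects/label/{term}.

    scheme_path is "authorities", "authorities/{scheme_name}", or
    "vocabulary/{vocabulary_name}". The term is percent-encoded as UTF-8
    and no trailing slash is appended.
    """
    return f"{BASE}/{scheme_path.strip('/')}/label/{quote(term, safe='')}"


encoded = quote("Bărăganul (Romania)", safe="")
print(encoded)
print(label_url("Orchids", "authorities/subjects"))
```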

If multiple concepts with the same label are found within the database, the system returns only the first concept it finds with that label. This is an issue for terms that also exist in Children's Subjects (e.g., robots). Therefore, if the optional {scheme_name} is known, it is best to use it in label searches. Hits are ordered alphabetically by the last token of the URI. For Library of Congress Subject Headings, this token is the Library of Congress Control Number, such as "sh85095334".
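A client that uses the label functionality has to tell the 302 and 404 outcomes apart. The sketch below assumes the established URI arrives in the standard `Location` header of the 302 response; the function name `resolve_label_response` is hypothetical, and the function only interprets a status code and headers that some HTTP client has already fetched without following the redirect.

```python
from typing import Mapping, Optional


def resolve_label_response(status: int, headers: Mapping[str, str]) -> Optional[str]:
    """Return the established URI for a 302, or None for a 404.

    Header names are compared case-insensitively. Any other status is
    treated as unexpected and raises ValueError.
    """
    if status == 302:
        for name, value in headers.items():
            if name.lower() == "location":
                return value
        raise ValueError("302 response without a Location header")
    if status == 404:
        return None
    raise ValueError(f"unexpected HTTP status {status}")


found = resolve_label_response(
    302, {"Location": "http://id.loc.gov/authorities/subjects/sh85095334"}
)
missing = resolve_label_response(404, {})
print(found, missing)
```

With the standard library, `http.client.HTTPSConnection` does not follow redirects on its own, so the status and headers of its response can be passed to this function directly.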


Suggest Services

There are two suggest services. Both are meant for cases where you think you know a label but do not want to type out the whole thing, or are unsure whether it contains diacritics or other special characters.

Suggest

The suggest service searches:
  • authorized labels (case and diacritic insensitive)

This search was developed specifically for authorities where resources are disambiguated, so there is only one answer for any given string.

Parameters:

  • q is the query text
  • callback helps with asynchronous functions using the service
  • rdftype is the MADSRDF node name
  • count is the number of results to return (up to 1,000)
  • offset is the place to start from if you are paging through results

Examples:

https://id.loc.gov/authorities/subjects/suggest?q=history%20in

Note: The JSON result format is not ideal, but we have left it in place to avoid breaking backward compatibility, and designed Suggest2 to replace it.

The response contains the searched term, the label of each result found, the number of documents with each label, and then the matching URIs.

        [
          "history in",
          [
            "History in advertising",
            "History in art",
            "History in art--Catalogs"
          ],
          [
            "1 result",
            "1 result",
            "1 result"
          ],
          [
            "http://id.loc.gov/authorities/subjects/sh2009008399",
            "http://id.loc.gov/authorities/subjects/sh85061252",
            "http://id.loc.gov/authorities/subjects/sh2009126431"
          ]
        ]
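Because the Suggest response is four parallel arrays, it is usually easier to work with once the arrays are zipped into one record per result. The sketch below assumes the response always has exactly this four-element shape; `Suggestion` and `parse_suggest` are hypothetical names, and the payload is the sample shown above rather than a live response.

```python
import json
from typing import List, NamedTuple


class Suggestion(NamedTuple):
    label: str
    hits: str
    uri: str


def parse_suggest(payload: str) -> List[Suggestion]:
    """Zip the parallel label, hit-count, and URI arrays into records."""
    _query, labels, hits, uris = json.loads(payload)
    return [Suggestion(*row) for row in zip(labels, hits, uris)]


sample = """
[
  "history in",
  ["History in advertising", "History in art", "History in art--Catalogs"],
  ["1 result", "1 result", "1 result"],
  ["http://id.loc.gov/authorities/subjects/sh2009008399",
   "http://id.loc.gov/authorities/subjects/sh85061252",
   "http://id.loc.gov/authorities/subjects/sh2009126431"]
]
"""

rows = parse_suggest(sample)
for row in rows:
    print(row.label, row.uri)
```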

Suggest2

This is a left-anchored or keyword search that can be plugged into a dropdown search application, returning results with each keystroke. The default search is left-anchored. Deprecated resources are included and point to their replacement, if available.

It expects multiple hits for any given term, and does some relevance ranking of results if you choose "keyword" searching.

The suggest2 service searches:
  • authorized labels (case and diacritic insensitive)
  • variant labels (case and diacritic insensitive)
  • codes (case sensitive)
  • tokens (case sensitive)

Keyword searches are punctuation-sensitive when the query contains punctuation; strip the punctuation from the query if you want a punctuation-insensitive search.

Construct a query using the scheme (e.g., /authorities/names/) if known, for more targeted results. If the scheme is not known, the query will search all resources (except classifications).

Parameters:

  • q is the query text
  • callback helps with asynchronous functions using the service
  • memberOf is the collection within a scheme
  • rdftype is the MADSRDF node name
  • searchtype defaults to "leftanchored" but can be "keyword"
  • count is the number of results to return (up to 1,000)
  • offset is the place to start from if you are paging through results
  • mime sets the serialization; it defaults to "json" but can be set to "xml"

Examples:

  1. Variant Label Search:

    Gun dog (https://id.loc.gov/authorities/subjects/suggest2?q=gun%20dog )

    This search finds the variant label and tells you the authLabel:

  2. Code search example: Search for MnU in Organizations:

    https://id.loc.gov/vocabulary/organizations/suggest2?q=MnU

  3. Token search example:

    Search: https://id.loc.gov/authorities/names/suggest2/?q=n2009017423

  4. Membership in a subset example:

    Search subjects for a pattern heading:

    https://id.loc.gov/suggest2?q=history&memberOf=http://id.loc.gov/authorities/subjects/collection_PatternHeadingH1156

  5. Particular MADSRDF resource type example:

    https://id.loc.gov/suggest2?q=Los%20Angeles%20Rams&rdftype=CorporateName
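Suggest2 URLs like the examples above can be assembled programmatically instead of by hand. This is a minimal sketch: `suggest2_url` and `BASE` are hypothetical names, and the keyword arguments are passed through as the documented parameters (memberOf, rdftype, searchtype, count, offset, mime). Values are percent-encoded, so a memberOf URI comes out with its ":" and "/" characters encoded.

```python
from urllib.parse import quote, urlencode

BASE = "https://id.loc.gov"


def suggest2_url(q: str, scheme_path: str = "", **params) -> str:
    """Build a Suggest2 URL, optionally scoped to a scheme path.

    scheme_path is e.g. "authorities/subjects"; leave it empty to search
    all resources. Parameters whose value is None are skipped.
    """
    query = {"q": q}
    query.update({k: v for k, v in params.items() if v is not None})
    path = f"/{scheme_path.strip('/')}" if scheme_path else ""
    # quote_via=quote encodes spaces as %20 rather than "+"
    return f"{BASE}{path}/suggest2?{urlencode(query, quote_via=quote)}"


print(suggest2_url("gun dog", "authorities/subjects"))
print(suggest2_url("Los Angeles Rams", rdftype="CorporateName"))
print(suggest2_url("history", searchtype="keyword", count=25))
```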

Sorting and Ranking

Left-anchored searches are ordered alphabetically, case and diacritic insensitive.

Keyword searches are in descending relevance order, using the same search ranking as the main search page.
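To reproduce a case- and diacritic-insensitive alphabetical order on the client side, for example when merging pages of left-anchored results, a folding sort key is one option. The sketch below is an approximation built on Unicode decomposition, not the service's actual collation; `fold` is a hypothetical helper and the labels are made-up sample data.

```python
import unicodedata


def fold(label: str) -> str:
    """Sort key that ignores case and combining diacritical marks."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


labels = ["Zebra", "Émile", "apple", "Eagle"]
ordered = sorted(labels, key=fold)
print(ordered)
```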

Note on Results

If you don't find something in a particular vocabulary but you suspect it should be there, use a broader search by dropping the subdirectories in the search:

https://id.loc.gov/authorities/subjects/suggest2/?q=ohio&count=25&rdftype=HierarchicalGeographic

https://id.loc.gov/authorities/suggest2/?q=ohio&count=25&rdftype=HierarchicalGeographic

https://id.loc.gov/suggest2/?q=ohio&count=25&rdftype=HierarchicalGeographic
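The broadening sequence above follows a simple rule: drop the last path segment until none remain. A minimal sketch of that rule follows; `broadening_urls` and `BASE` are hypothetical names, and the function only generates the URLs, leaving the requests themselves to the caller.

```python
from typing import Iterator
from urllib.parse import quote, urlencode

BASE = "https://id.loc.gov"


def broadening_urls(scheme_path: str, **params) -> Iterator[str]:
    """Yield Suggest2 URLs from the narrowest scope to the broadest."""
    parts = [p for p in scheme_path.split("/") if p]
    query = urlencode(params, quote_via=quote)
    for depth in range(len(parts), -1, -1):
        prefix = "/".join(parts[:depth])
        path = f"/{prefix}" if prefix else ""
        yield f"{BASE}{path}/suggest2/?{query}"


urls = list(
    broadening_urls(
        "authorities/subjects", q="ohio", count=25, rdftype="HierarchicalGeographic"
    )
)
for url in urls:
    print(url)
```

A caller could request each URL in turn and stop at the first one that returns a hit.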


User-Agent Header

The Authorities and Vocabularies service requires all clients to include a User-Agent header with every request. If you are using a browser, the browser does this automatically. If you are using a command line tool or code library, you need to include this header explicitly as part of your request.

If you are using a command line tool or code library, we kindly ask that you customize the header when using the ID service to either identify yourself or identify the tool using our service. It helps us better understand who is using the service and which applications are designed to interact with ID automatically.
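With Python's standard library, a customized header can be attached when the request object is created. In this sketch the `USER_AGENT` string and the `make_request` helper are hypothetical placeholders; substitute a value that identifies you or your tool. The code only builds the request and does not send it.

```python
from urllib.request import Request

# Placeholder value: replace with your own tool name and contact details.
USER_AGENT = "example-id-client/0.1 (contact: someone@example.org)"


def make_request(url: str) -> Request:
    """Create a request object that carries an explicit User-Agent header."""
    return Request(url, headers={"User-Agent": USER_AGENT})


req = make_request("https://id.loc.gov/authorities/subjects/suggest2?q=gun%20dog")
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Passing `req` to `urllib.request.urlopen` would then send the request with that header.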

OpenSearch support

The Authorities and Vocabularies service supports OpenSearch for querying, responses, and autodiscovery of the Library of Congress Subject Headings. OpenSearch is supported by browsers such as Firefox 2 and 3 and Internet Explorer 7 and 8. When the browser discovers the OpenSearch functionality for this service, the site can be queried directly from the browser's built-in search bar without visiting our web site.

Querying

No tool for querying the backend RDF is provided in this release. If you need to perform custom queries for more detailed analysis of our data, please download the bulk metadata, available as RDF/XML or N-Triples. Once downloaded, the data can be loaded into any number of SPARQL-aware engines, such as RDF4J.
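For very simple lookups, a downloaded N-Triples file can also be scanned line by line without a SPARQL engine. The sketch below is a lightweight stand-in, not a full N-Triples parser: it only recognizes triples whose object is a literal and leaves escape sequences undecoded. The name `literal_triples`, the hand-written sample lines, and the use of skos:prefLabel as the predicate of interest are all assumptions.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumed predicate of interest; check the downloaded data for the ones it uses.
PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"

# <subject> <predicate> "literal" with an optional @lang or ^^<datatype>, then " ."
_LITERAL_TRIPLE = re.compile(
    r'^<([^>]+)>\s+<([^>]+)>\s+"((?:[^"\\]|\\.)*)"(?:@[A-Za-z0-9-]+|\^\^<[^>]+>)?\s*\.\s*$'
)


def literal_triples(lines: Iterable[str], predicate: str) -> Iterator[Tuple[str, str]]:
    """Yield (subject URI, literal value) for lines with the given predicate."""
    for line in lines:
        match = _LITERAL_TRIPLE.match(line.strip())
        if match and match.group(2) == predicate:
            yield match.group(1), match.group(3)


# Hand-written sample lines in N-Triples syntax, not real bulk data.
sample = [
    '<http://id.loc.gov/authorities/subjects/sh85061252> '
    '<http://www.w3.org/2004/02/skos/core#prefLabel> "History in art"@en .',
    '<http://id.loc.gov/authorities/subjects/sh85061252> '
    '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> '
    '<http://www.w3.org/2004/02/skos/core#Concept> .',
]

results = list(literal_triples(sample, PREF_LABEL))
print(results)
```

In practice the `lines` argument can be an open file handle, so the file is streamed rather than loaded into memory.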

If you are querying against the RDF/XML serialization of the data, XSPARQL is a technology that combines aspects of the XQuery language and SPARQL into the same syntax. Similarly, XSLT+SPARQL from Diego Berrueta allows SPARQL interaction via XSLT.