FAQ (Frequently asked questions)
- The same original article appears many times in the results. Is this an error?
No, usually this is not an error.
An Outbreak Article corresponds to exactly one outbreak.
Hence, if an original article (e.g., from PubMed) describes more than one outbreak,
several Outbreak Articles referring to that original article are needed.
Also note that two different Outbreak Articles do not necessarily refer to different outbreaks.
If two different original articles describe the same outbreak, two different Outbreak Articles
are used.
This is probably a rare case, but you should keep that in mind, especially when doing
statistics.
You can think of an Outbreak Article as corresponding to one outbreak description in the
literature.
Having said that, there might be a few true duplicate Outbreak Articles in the database. We are working on their removal.
- I cannot open the PDF files! What can I do?
Your software for reading PDF files might be too old. Please update it to the newest version.
(E.g., you can use the newest version of
Adobe Reader.)
As an alternative, you can use the HTML view that is accessible in the same results column
as the PDF view.
- Outbreak Database was helpful when I wrote my paper. Can I acknowledge you in my paper?
Yes, we would be pleased to be acknowledged in your paper. Please acknowledge us as:
Institute for Hygiene and Environmental Medicine, Charité – University Medicine Berlin, Germany
and reference us as:
Behnke, M.; Weitzel-Kage, D.; Hansen, S.; Eckstein, M.; Stolzenhain, T.; Gastmeier, P.
www.outbreak-database.com
Last retrieved in: <month> <year> (e.g., January 09)
If you do not reference us in your primary text, please reference us in the acknowledgement.
If you would like your article to be listed on the links page, please contact us.
How to use the search function in Outbreak Database
- Quick Start
- Terms
- Fields
- Term Modifiers
- Boolean Operators
- Grouping
- Field Grouping
- Escaping Special Characters
Quick Start
In short, you can enter and combine search terms much as you would in a
web search engine such as Google.
E.g.:
mrsa finds all articles containing mrsa or MRSA etc.
mrsa dirt finds all articles containing mrsa and dirt.
"hepatitis c" finds all articles containing the phrase hepatitis c, i.e. term hepatitis immediately followed by term c.
If you prefix a term with a tag followed by a colon, it will
be searched only in the field belonging to that tag. Otherwise, it will be searched in all fields.
The page Field Reference
describes the structure of an Outbreak article and the tags that are usable for any query.
(The following documentation is based on the documentation of Apache Lucene.)
Terms
A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases.
A Single Term is a single word such as virus or mrsa.
A Phrase is a group of words surrounded by double quotes such as "staphylococcus aureus".
Multiple terms can be combined together with Boolean operators to form a more complex query (see below).
The search terms are case-insensitive, i.e., Virus yields the same result as virus.
Fields
If you prefix a term with a tag followed by a colon, the term will be searched only in the field belonging to that tag. Otherwise, it will be searched in all fields.
Let's see some examples. If you want to find an article about an outbreak in the USA containing the phrase "they washed their hands" somewhere in the article, you can enter:
cy:usa AND "they washed their hands"
or
cy:usa "they washed their hands"
Note: The field is only valid for the term that it directly precedes, so the query
cy:United Arab Emirates
will search for United only in the country field, and for Arab and Emirates anywhere in the article.
To restrict the whole country name to the country field, enclose it in double quotes:
cy:"United Arab Emirates"
Note: The characters in a tag must either be all UPPERCASE or all lowercase.
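The field rules above can be sketched as a small helper that builds a field-restricted query string. This is only an illustration, not part of Outbreak Database: the function name field_query is hypothetical, and the only tag used is cy from the examples above.

```python
def field_query(tag: str, value: str) -> str:
    """Build a query such as cy:usa or cy:"United Arab Emirates"."""
    # Per the note above, a tag must be all lowercase or all UPPERCASE.
    if tag not in (tag.lower(), tag.upper()):
        raise ValueError("tag must be all lowercase or all UPPERCASE")
    # A multi-word value must be quoted; otherwise only its first word
    # is bound to the field and the rest is searched in all fields.
    if any(ch.isspace() for ch in value):
        value = '"' + value + '"'
    return tag + ":" + value


print(field_query("cy", "usa"))
print(field_query("cy", "United Arab Emirates"))
```

The second call prints the quoted form, which keeps the whole country name inside the cy field.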
Term Modifiers
Wildcard Searches
Single and multiple character wildcard searches are supported.
To perform a single character wildcard search use the "?" symbol.
To perform a multiple character wildcard search use the "*" symbol.
The single character wildcard search looks for terms that match the pattern with
the "?" replaced by any single character. For example, to search for "text" or "test" you can use the
search:
te?t
Multiple character wildcard searches look for 0 or more characters. For example,
to search for test, tests or tester, you can use the search:
test*
You can also use the wildcard searches in the middle of a term:
te*t
Note: You cannot use a * or ? symbol as the first character of a search.
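To make the wildcard rules concrete, the following sketch translates a wildcard pattern into a Python regular expression and tests it against single terms. It only mirrors the behaviour described above and does not claim to show how the search engine implements wildcards; the function name wildcard_matches is hypothetical.

```python
import re


def wildcard_matches(pattern: str, term: str) -> bool:
    """Return True if a single term matches a ?/* wildcard pattern."""
    # Per the note above, a pattern must not start with a wildcard.
    if pattern[:1] in ("*", "?"):
        raise ValueError("a pattern must not start with * or ?")
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")    # exactly one character
        elif ch == "*":
            parts.append(".*")   # zero or more characters
        else:
            parts.append(re.escape(ch))
    # Search terms are case-insensitive, so the match ignores case.
    return re.fullmatch("".join(parts), term, re.IGNORECASE) is not None


for candidate in ("test", "tests", "tester", "text"):
    print(candidate, wildcard_matches("test*", candidate))
```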
Fuzzy Searches
To do a fuzzy search, use the tilde symbol, "~", at the end of a Single Term.
For example, to search for a term similar in spelling to
staphylokokus, use the fuzzy search:
staphylokokus~
This search will also find Staphylococcus.
An additional (optional) parameter can specify the required similarity. The value
is between 0 and 1. With a value closer to 1, only terms with a higher similarity
will be matched. For example:
staphylokokus~0.8
The default that is used if the parameter is not given is 0.5.
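The text above does not say how similarity is computed. As a rough sketch, one can assume a Levenshtein edit distance normalised by the length of the shorter word; this normalisation is an assumption made here for illustration, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(query: str, candidate: str) -> float:
    """Assumed measure: 1 - distance / length of the shorter word."""
    q, c = query.lower(), candidate.lower()
    shorter = min(len(q), len(c))
    if shorter == 0:
        return 0.0
    return max(0.0, 1.0 - edit_distance(q, c) / shorter)


print(edit_distance("staphylokokus", "staphylococcus"))
print(round(similarity("staphylokokus", "Staphylococcus"), 2))
```

Under this assumed measure the two spellings differ by three edits and score roughly 0.77, which is above the default threshold of 0.5. Whether the database scores the pair in exactly this way is not confirmed by the documentation above.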
Proximity Searches
Finding words that are within a specific distance of each other is also supported. To do a proximity
search, use the tilde symbol, "~", at the end of a Phrase. For example, to search
for dirt and
infection within 10 words of each other in a document, use the search:
"dirt infection"~10
Note that in rare cases, proximity search might return strange results! We are working on the issue.
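As a simplified reading of "within N words of each other", the sketch below checks whether two words occur at most a given number of word positions apart. The search engine's own distance measure may count differently, so treat this purely as an illustration; the function name and the sample sentence are hypothetical.

```python
import re


def within_words(text: str, first: str, second: str, max_distance: int) -> bool:
    """True if some occurrence of each word lies at most max_distance positions apart."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )


sentence = "Dirt on the ward floor was linked to the infection."
print(within_words(sentence, "dirt", "infection", 10))
print(within_words(sentence, "dirt", "infection", 5))
```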
Boosting a Term
To boost a term use the caret, "^", symbol with a boost factor (a number) at the
end of the term you are searching. The higher the boost factor, the more relevant
the term will be for your search result.
Boosting allows you to control the relevance of a document by boosting its term.
For example, if you are searching for
dirt infection
and you want the term "dirt" to be more
relevant, boost it by placing the ^ symbol and the boost factor directly after the term.
You would type:
dirt^4 infection
This will make documents with the term "dirt"
appear more relevant. You can also boost Phrase Terms as in the example:
"dirt and dust"^4 "infection and disease"
By default, the boost factor is 1. Although the boost factor must be positive, it
can be less than 1 (e.g. 0.2).
Boolean Operators
Boolean operators allow terms to be combined through logical operators. Lucene supports AND, "+", OR, NOT and "-" as Boolean operators.
(Note: Boolean operators must be all UPPERCASE.)
AND
The AND operator is the default conjunction operator. This means that if there is
no Boolean operator between two terms, the AND operator is used.
The AND operator matches documents where both terms exist in a single article.
To search for documents that contain "dirt and dust" and "infection and disease"
use the query:
"dirt and dust" AND "infection and disease"
or
"dirt and dust" "infection and disease"
OR
The OR operator links two terms and finds a matching article if either of the terms
exists in the article.
To search for documents that contain either "dirt and dust" or just "infection and
disease" or both use the query:
"dirt and dust" OR "infection and disease"
NOT
The NOT operator excludes documents that contain the term after NOT.
To search for documents that contain "dirt and dust" but not "infection and disease"
use the query:
"dirt and dust" NOT "infection and disease"
Note: The NOT operator does not behave exactly like a logical NOT. E.g., it cannot
be used with just one term. For example, the following search will return no results:
NOT "foobar 123"
The following query will not return all documents:
mrsa OR (NOT "foobar 123")
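One way to picture this behaviour is to treat NOT as removing articles from an existing result set rather than as a true logical negation. The sketch below models that with Python sets; the tiny article collection and the function names are hypothetical, and the model is an interpretation of the notes above, not the engine's actual implementation.

```python
# Hypothetical mini collection: article id -> terms it contains.
ARTICLES = {
    1: {"mrsa", "dirt"},
    2: {"mrsa", "foobar"},
    3: {"dust"},
}


def docs_with(term: str) -> set:
    """Ids of all articles containing the term."""
    return {doc_id for doc_id, terms in ARTICLES.items() if term in terms}


def apply_not(positive_matches: set, excluded_term: str) -> set:
    """NOT only subtracts from matches that already exist."""
    return positive_matches - docs_with(excluded_term)


# mrsa NOT foobar: start from the mrsa matches, remove foobar articles.
mrsa_not_foobar = apply_not(docs_with("mrsa"), "foobar")

# NOT foobar alone: there are no positive matches to subtract from.
lone_not = apply_not(set(), "foobar")

# mrsa OR (NOT foobar): the sub query is empty, so only mrsa matches remain.
mrsa_or_lone_not = docs_with("mrsa") | lone_not

# A true logical NOT would instead return every article without foobar.
logical_not = set(ARTICLES) - docs_with("foobar")

print(mrsa_not_foobar, lone_not, mrsa_or_lone_not, logical_not)
```

In this model, article 3 is missing from the last query's result even though it does not contain foobar, which matches the note that the query does not return all documents.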
Grouping
You can use parentheses to group clauses to form sub queries. This can be very useful if you want to control the boolean logic for a query.
To search for articles that contain infection and, in addition, either dirt or dust, use the query:
(dirt OR dust) AND infection
This eliminates any confusion and makes sure that the article must contain infection and must also contain either dust or dirt.
Always use parentheses when grouping boolean queries! Writing something like
dirt OR dust AND infection
might lead to unexpected results.
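To see why the parentheses matter, the sketch below evaluates both possible groupings of the unparenthesised query with Python sets, using a small hypothetical article collection. It does not claim which grouping the parser actually chooses; it only shows that the two readings give different results.

```python
# Hypothetical mini collection: article id -> terms it contains.
ARTICLES = {
    1: {"dirt"},
    2: {"dust", "infection"},
    3: {"dirt", "infection"},
    4: {"infection"},
}


def docs_with(term: str) -> set:
    """Ids of all articles containing the term."""
    return {doc_id for doc_id, terms in ARTICLES.items() if term in terms}


dirt = docs_with("dirt")
dust = docs_with("dust")
infection = docs_with("infection")

# (dirt OR dust) AND infection
grouped_left = (dirt | dust) & infection

# dirt OR (dust AND infection)
grouped_right = dirt | (dust & infection)

print(sorted(grouped_left))
print(sorted(grouped_right))
```

Article 1 contains dirt but not infection, so it appears only under the second reading.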
Field Grouping
You can use parentheses to group multiple clauses to a single field.
To search for a country that contains both the word "united" and the phrase "arab emirates" use the query:
cy:(united "arab emirates")
Escaping Special Characters
Lucene supports escaping special characters that are part of the query syntax. The current list of special characters is
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
To escape these characters, put a \ before each character. For example, to search for (1+1):2 use the query:
\(1\+1\)\:2
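If you build queries programmatically, escaping can be automated. The following is a minimal sketch; the function name escape_query is hypothetical. It assumes that the two-character operators && and || can be neutralised by escaping each & and | individually.

```python
# Single characters taken from the list above; & and | cover && and ||.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\')


def escape_query(text: str) -> str:
    """Prefix every special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


print(escape_query("(1+1):2"))
```

For the input (1+1):2 this prints the escaped query shown above.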