Querying District Cases
Creating District Queries for the Lex Machina API
Overview
The Lex Machina data set is a highly curated and normalized set of federal district cases. As such, there are many attributes and participants on which you can query it. This flexibility brings with it some complexity. In this article, we will unravel this complexity, and enable you to enrich your applications through Lex Machina’s API.
The query endpoint is /query-district-cases and this is a POST request. The body of the POST will be a JSON structure of a form similar to this:
{
"caseStatus": "Terminated",
"caseTypes": {
"include": ["Contracts"]
},
"dates": {
"filed": {"onOrBefore": "2010-01-01"},
"terminated": {"onOrAfter": "2015-01-01"}
},
"damages":
[{"minimumAmount": 1000000}],
"page": 1,
"pageSize": 50
}
There are many operators and parameters that can be used in the query, and much of this article discusses them. Here is a top-level description of the query mechanism:
- In any given query, all the attributes are implicitly ANDed together. In the example above, the cases returned will be of status “Terminated” AND of type “Contracts”, etc.
- Some attributes take “include” and “exclude” operators. Most of them accept both in the same query, in which case the returned cases must satisfy both criteria. If the same value appears in both the include and the exclude, the result set will be empty.
- The dates queries have “onOrBefore” and “onOrAfter” operators. If you use the same date in both, you are querying for exactly that date.
- The result set defaults to 5 returned cases and via parameters can be increased to at most 100 returned cases. It is possible to page through larger data sets, which will be discussed later.
- Attempting to query the endpoint with a syntactically erroneous query payload will result in an error. It is possible in some cases to construct a syntactically valid query but with bad data, such as non-existent references to judges, parties or other participants. This will result in empty results but will not be an error.
- The results of this query will be Case IDs for the district cases. In order to get a full picture of these cases, they will need to be looked up individually with the /district-cases/{case_id} endpoint.
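The points above can be sketched in Python as follows. This is a minimal illustration, not an official client: the base URL, the bearer-token header and the helper names build_query_request and case_url are assumptions made for the example. Only the two endpoint paths and the query body come from this article. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the real API base URL.
BASE_URL = "https://api.example.com"


def build_query_request(body: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for /query-district-cases."""
    return urllib.request.Request(
        url=f"{BASE_URL}/query-district-cases",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the article does not cover auth.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def case_url(case_id: int) -> str:
    """URL for looking up one returned case ID in full."""
    return f"{BASE_URL}/district-cases/{case_id}"


# All top-level attributes are implicitly ANDed together.
query = {
    "caseStatus": "Terminated",
    "caseTypes": {"include": ["Contracts"]},
    "dates": {
        "filed": {"onOrBefore": "2010-01-01"},
        "terminated": {"onOrAfter": "2015-01-01"},
    },
    "damages": [{"minimumAmount": 1000000}],
    "page": 1,
    "pageSize": 50,
}

request = build_query_request(query, token="YOUR_TOKEN")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it. Each case ID in the response would then be fetched individually from the URL that case_url produces.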
Case Attributes
The first type of query we will discuss can be thought of as attributes of the case. This is metadata attached directly to the case. Some of it is a matter of fact, such as the court in which the case was filed, and some is a result of Lex Machina curation, such as the tags and types.
Case Status
The valid options for this key are “Open” and “Terminated”. At this point, these are the only two valid values for this field.
{
"caseStatus":"Open"
}
Case Types
The list of types in our system is relatively stable but does change over time. The current list of all types in our system is always available at /list-case-types . For more information about the definitions of these types, you can look at the case type documentation for the Lex Machina Web UI.
This field has an include and exclude, each of which takes an array of strings.
{
"caseTypes":{
"include": ["Antitrust", "Contracts"],
"exclude": ["Patent"]
}
}
Case Tags
This tagging is part of the Lex Machina curation process and is the result of cutting-edge NLP and skilled human interpretation of the case. The current list of all tags in our system is always available at /list-case-tags .
This field has an include and an exclude, each of which takes an array of strings.
{
"caseTags": {
"include": ["Chapter 7","Chapter 13"],
"exclude": ["IRS Summons"]
}
}
Courts
This is the court in which the case takes place. Any case will have only a single court of record, but cases can be queried across courts. The current list of courts in the system is always available at /list-courts. Accessing that list will provide the name, short name and abbreviation for each court. Any of these can be used interchangeably in queries. For example, using “U.S. District Court for the District of New Jersey”, “D.N.J.”, and “njd” are all synonymous and will return the same data.
This field has an include and an exclude, each of which takes an array of strings. Because a case can have only one court, if you have an include then the exclude is not useful.
{
"courts": {
"include": ["njd","dcd"]
}
}
{
"courts": {
"exclude": ["ord"]
}
}
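Case types, case tags and courts all share the same include/exclude shape, so one small helper can build any of them. The sketch below is an illustration only; include_exclude is a hypothetical name, and rejecting overlapping values locally is a design choice made here because such a query would simply come back empty.

```python
def include_exclude(include=(), exclude=()):
    """Build an {"include": [...], "exclude": [...]} filter, omitting empty sides."""
    include, exclude = list(include), list(exclude)
    overlap = set(include) & set(exclude)
    if overlap:
        # The same value on both sides can never match a case.
        # Note: this compares strings only, so court synonyms such as
        # "njd" and "D.N.J." would not be detected as the same court.
        raise ValueError(f"values in both include and exclude: {sorted(overlap)}")
    filt = {}
    if include:
        filt["include"] = include
    if exclude:
        filt["exclude"] = exclude
    return filt


query = {
    "caseTypes": include_exclude(include=["Antitrust", "Contracts"], exclude=["Patent"]),
    "caseTags": include_exclude(include=["Chapter 7", "Chapter 13"], exclude=["IRS Summons"]),
    "courts": include_exclude(include=["njd", "dcd"]),
}
print(query)
```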
Dates
There are multiple date-based aspects of cases that can be used in queries. Every date query uses a “dates” JSON key whose value is an object containing one sub-object per type of date being queried. Each sub-object must be qualified with at least one of the “onOrAfter” and “onOrBefore” dates. The “dates” object must have at least one of the following four objects:
- “filed”: the date of the original filing of the court case paperwork
- “terminated”: the date on which the court case ended, regardless of how it ended
- “trial”: the date on which a trial was commenced
- “lastDocket”: the date on which the last entry was entered for the court case
Example Queries
To find cases that began and ended in calendar 2020 the full body of the POST would be:
{
"dates": {
"filed": {
"onOrAfter": "2020-01-01"
},
"terminated": {
"onOrBefore": "2020-12-31"
}
}
}
To find cases that closed more than a decade ago, here meaning on or before January 1, 2010, the following query could be used:
{
"dates": {
"terminated": {
"onOrBefore": "2010-01-01"
}
}
}
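Date bounds like the ones above are easy to generate from Python date objects. The following is a small sketch under our own assumptions: date_range and calendar_year are hypothetical helper names, and rejecting an inverted range locally is a choice made for this example rather than documented API behavior.

```python
from datetime import date


def date_range(on_or_after=None, on_or_before=None):
    """Build one date sub-object with at least one of the two bounds."""
    if on_or_after is None and on_or_before is None:
        raise ValueError("at least one of onOrAfter / onOrBefore is required")
    if on_or_after and on_or_before and on_or_after > on_or_before:
        raise ValueError("onOrAfter must not be later than onOrBefore")
    bounds = {}
    if on_or_after is not None:
        bounds["onOrAfter"] = on_or_after.isoformat()  # YYYY-MM-DD
    if on_or_before is not None:
        bounds["onOrBefore"] = on_or_before.isoformat()
    return bounds


def calendar_year(year):
    """Both bounds for a full calendar year."""
    return date_range(date(year, 1, 1), date(year, 12, 31))


# Cases that began and ended in calendar 2020, as in the first example above.
query = {
    "dates": {
        "filed": date_range(on_or_after=date(2020, 1, 1)),
        "terminated": date_range(on_or_before=date(2020, 12, 31)),
    }
}
print(query)
```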
Participants
Another way to query a case is to query via the participants. These currently can be:
- Judges
- Magistrates
- Law Firms
- Parties
Each of these participant types is queried by its respective ID: JudgeID, MagistrateID, LawFirmID, PartyID.
Magistrates and judges have two operators: “include” and “exclude”. Both take arrays of integers, each of which is the ID of a judge or magistrate. These can be used to look for cases with or without specific judges and magistrates involved. If more than one of these is used, they are ANDed together for the query.
Judge and magistrate IDs can be obtained from the /search-judges/ endpoint.
The litigants all use a similar pattern. All have the “include” and “exclude” operators as above, which match the law firm, attorney or party in any role. In addition, they can be included or excluded based on specific roles: “Plaintiff”, “Defendant” or “ThirdParty”. Examples are below.
There are currently no endpoints for searching for litigants. The IDs must come either from existing case data or from external sources such as the main Lex Machina product site.
Example Queries
To find cases involving Judge Manuel Real, use the following query:
{
"judges": {
"include": [1973]
}
}
To find cases in which Quinn Emanuel Urquhart & Sullivan was involved at all, use this query:
{
"lawFirms": {
"include": [920]
}
}
To find cases in which Quinn Emanuel Urquhart & Sullivan represented the plaintiff, use this query:
{
"lawFirms": {
"includePlaintiff": [920]
}
}
To find cases in which Quinn Emanuel Urquhart & Sullivan represented a defendant and also a third party, use this query:
{
"lawFirms": {
"includeDefendant": [920],
"includeThirdParty": [920]
}
}
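The role-specific operators shown above can be produced by a small mapping from role to key name. This sketch uses only the four include keys that appear in the examples; role_filter and ROLE_KEYS are hypothetical names introduced for illustration.

```python
# Operator names as they appear in the examples above.
ROLE_KEYS = {
    "any": "include",
    "plaintiff": "includePlaintiff",
    "defendant": "includeDefendant",
    "third_party": "includeThirdParty",
}


def role_filter(**ids_by_role):
    """Build a litigant filter, e.g. role_filter(plaintiff=[920])."""
    filt = {}
    for role, ids in ids_by_role.items():
        if role not in ROLE_KEYS:
            raise ValueError(f"unknown role: {role}")
        if not all(isinstance(i, int) for i in ids):
            raise TypeError("participant IDs must be integers")
        filt[ROLE_KEYS[role]] = list(ids)
    return filt


# Law firm 920 represented a defendant and also a third party.
query = {"lawFirms": role_filter(defendant=[920], third_party=[920])}
print(query)
```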
Outcomes
Cases can also be queried by case events and outcomes of the case. This allows for filtering on:
- Events
- Resolutions
- Findings
- Judgment Source
- Remedies
- Damages
These will be discussed individually.
Events
Events are key litigation milestones. These include the points of judgment, contestation of dismissal and other procedural actions. The list of events in the system is available at /list-events.
Using valid events from the list above, queries can be constructed with the “includeEventTypes” and “excludeEventTypes” operators.
{
"events": {
"includeEventTypes": ["Summary Judgment"],
"excludeEventTypes": ["Dismiss (Contested)"]
}
}
Resolutions
Resolutions are ways that a case can be concluded. Every case has exactly one resolution. The list of resolutions is available at /list-case-resolutions. They are defined as the pair of “summary” and “specific” values such as:
{
"summary": "Claimant Win",
"specific": "Jury"
}
Using valid resolutions from the list above, they can be queried via “include” or “exclude”. For resolutions, you can only use one of either “include” or “exclude” but not both simultaneously.
{
"resolutions": {
"include": [
{
"summary": "Claimant Win",
"specific": "Bench Trial"
},
{
"summary": "Claimant Win",
"specific": "Jury"
}
]
}
}
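Because resolutions accept either “include” or “exclude” but not both, a helper that takes a single mode makes the invalid combination impossible to express. This is a sketch only, and resolutions_filter is a hypothetical name.

```python
def resolutions_filter(pairs, mode="include"):
    """Build a resolutions filter from (summary, specific) pairs.

    Only one mode can be chosen, mirroring the rule that "include" and
    "exclude" cannot be used together for resolutions.
    """
    if mode not in ("include", "exclude"):
        raise ValueError("mode must be 'include' or 'exclude'")
    return {
        mode: [{"summary": summary, "specific": specific} for summary, specific in pairs]
    }


query = {
    "resolutions": resolutions_filter(
        [("Claimant Win", "Bench Trial"), ("Claimant Win", "Jury")]
    )
}
print(query)
```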
Findings
Findings are decisions about points of law made by the court as the case progresses. An individual case can have many of them. Cases can be queried for findings using the following fields:
- judgmentSource with a judgment source filter, using values from the valid list at /list-judgment-sources
  - include for sources to include
  - exclude for sources to exclude
- awardedToParties with specified party ID(s)
- awardedAgainstParties with specified party ID(s)
- nameType from valid list at /list-damages. Note this list groups names under the type they apply to, e.g.:
  "General": [ "Other / Mixed Damage Types", "Attorneys' Fees / Costs", "Prejudgment Interest" ],
  - include for names and types to include
  - exclude for names and types to exclude
- date
- patentInvalidityReasons
  - include a list of strings of invalidity reasons to include
The “findings” list in the query must contain objects that contain at least one of the above fields. When multiple are provided, all cases matching at least one will be found.
To find all cases with findings against Todd McFarlane this query would be used:
{
"findings": [{
"awardedAgainstParties": [196852]
}
]
}
To find cases that contain findings that occurred at trial, the following query would be used:
{
"findings": [{
"judgmentSource": {"include": ["Trial"]}
}
]
}
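Since every object in the “findings” list must carry at least one of the fields listed above, a payload can be checked locally before it is sent. The sketch below is an illustration; check_findings is a hypothetical helper, and the field names are taken from the list in this section.

```python
# Field names from the list in this section.
FINDING_FIELDS = {
    "judgmentSource",
    "awardedToParties",
    "awardedAgainstParties",
    "nameType",
    "date",
    "patentInvalidityReasons",
}


def check_findings(findings):
    """Return the list unchanged if every object has at least one known field."""
    for index, item in enumerate(findings):
        if not item:
            raise ValueError(f"findings[{index}] is empty")
        unknown = set(item) - FINDING_FIELDS
        if unknown:
            raise ValueError(f"findings[{index}] has unknown fields: {sorted(unknown)}")
    return findings


query = {"findings": check_findings([{"awardedAgainstParties": [196852]}])}
print(query)
```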
Remedies
Remedies, in Lex Machina’s API, are types of non-monetary relief that are mandated by the court after reaching a finding. Cases can be queried for specific remedies via:
- judgmentSource with a judgment source filter, using values from the valid list at /list-judgment-sources
  - include for sources to include
  - exclude for sources to exclude
- awardedToParties with specified party ID(s)
- awardedAgainstParties with specified party ID(s)
- nameType
  - include for names and types to include
  - exclude for names and types to exclude
- date
The “remedies” list in the query must contain objects that contain at least one of the above fields. When multiple are provided, all cases matching at least one will be found.
To query for cases with remedies awarded against Todd McFarlane, the following query would be used:
{
"remedies": [{
"awardedAgainstParties": [196852]
}
]
}
Damages
Damages are a specific class of remedy that provides for a monetary award. These can be queried similarly to findings. The values that can be included are:
- judgmentSource with a judgment source filter, using values from the valid list at /list-judgment-sources
  - include for sources to include
  - exclude for sources to exclude
- awardedToParties with specified party ID(s)
- awardedAgainstParties with specified party ID(s)
- nameType
  - include for names and types to include
  - exclude for names and types to exclude
- date
- minimumAmount of the damage award in USD. Note that this value cannot include any formatting like a dollar sign or commas.
The “damages” list in the query must contain objects that contain at least one of the above fields. When multiple are provided, all cases matching at least one will be found.
To find all cases with monetary damages of at least $10,000 awarded to Apple Inc, you would use:
{
"damages": [{
"awardedToParties": [2273],
"minimumAmount": 10000
}
]
}
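Since minimumAmount must be a bare number, amounts that arrive as formatted text need cleaning first. The following sketch shows one way to do that; plain_amount and damages_filter are hypothetical helpers, and handling whole-dollar amounts only is an assumption made to keep the example short.

```python
def plain_amount(text):
    """Convert text such as "$10,000" to the integer 10000 (whole dollars only)."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    if not cleaned.isdigit():
        raise ValueError(f"not a whole-dollar amount: {text!r}")
    return int(cleaned)


def damages_filter(awarded_to=None, minimum_amount=None):
    """Build one object for the "damages" list from the fields supplied."""
    item = {}
    if awarded_to:
        item["awardedToParties"] = list(awarded_to)
    if minimum_amount is not None:
        item["minimumAmount"] = minimum_amount
    if not item:
        raise ValueError("a damages object needs at least one field")
    return item


query = {
    "damages": [
        damages_filter(awarded_to=[2273], minimum_amount=plain_amount("$10,000"))
    ]
}
print(query)
```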
Patents
Patents are a special set of data in the Lex Machina database. Cases can be queried by the patents involved in the litigation, which can be included or excluded like many of the other fields. Each patent number is an integer in the JSON object: the same number the patent office assigns, without any formatting.
Note that while we do have an endpoint to display information about patents at /patents/{patent_number}, only patents that have been involved in litigation will be present in the Lex Machina data. Our data is not a full copy of what is available from the patent office, only the part relevant to cases in our system.
To find cases involving the Personal Audio patent, use this query:
{
"patents": {
"include": [8112504]
}
}
Multidistrict Litigation
Federal cases can be searched for via MDL case numbers. These can be included or excluded in a query by providing an array of MDL numbers. A valid query can have an include or exclude array but not both.
{
"mdl": {
"include": [8112504, 765103]
}
}
Query Control
There are several options that allow you to control the number and manner of the returned results.
“ordering” will allow you to change the sort order of the results. By default, this value will be “ByFirstFiled”. The value “ByLastFiled” can be specified to change this.
“pageSize” controls how many results will be returned in a single query. The default is 5. This value must be greater than 0 and cannot be greater than 100.
“page” works with the previous value to support paging through large data sets. In a hypothetical query with exactly 900 returned cases, querying with a page size of 100 and page of 1 will return results 1 through 100 in the specified sort order. Querying with a page size of 100 and page of 2 will return results 101 through 200.
{
"ordering": "ByFirstFiled",
"page": 2,
"pageSize": 100
}
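Paging through a large result set can be sketched as a generator that requests page after page until a short page comes back. The helper below is illustrative only. It assumes a caller-supplied run_query function that sends the body and returns that page's case IDs as a list, and it assumes a page past the end of the results comes back empty. Neither assumption is documented in this article. A fake run_query stands in for the API so that the example runs offline.

```python
def iter_case_ids(run_query, body, page_size=100):
    """Yield case IDs across pages, stopping at the first short page."""
    if not 0 < page_size <= 100:
        raise ValueError("pageSize must be between 1 and 100")
    page = 1
    while True:
        ids = run_query({**body, "page": page, "pageSize": page_size})
        yield from ids
        if len(ids) < page_size:
            break  # a short (or empty) page means there is nothing left
        page += 1


# Stand-in for the API: 250 fake case IDs served one page at a time.
FAKE_RESULTS = list(range(1, 251))


def fake_run_query(body):
    start = (body["page"] - 1) * body["pageSize"]
    return FAKE_RESULTS[start:start + body["pageSize"]]


all_ids = list(iter_case_ids(fake_run_query, {"ordering": "ByFirstFiled"}))
print(len(all_ids))
```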
Tying it Together
With all of the above, now we can explore some more complex queries.
Query 1
For this query, let us look at the patent used above. We want all cases that:
- involved patent 8112504
- had monetary damages
- went to trial
{
"patents": {
"include": [8112504]
},
"damages": [{
"minimumAmount": 1
}],
"caseTags": {
"include": ["Trial"]
}
}
Query 2
For this query, we want all cases that:
- Were filed in the District of New Jersey
- Involve a contract dispute
- Were filed in calendar 2019
- Were terminated in calendar 2021
- Are returned with the maximum possible page size
{
"courts": {
"include": ["njd"]
},
"caseTypes": {
"include": ["Contracts"]
},
"dates": {
"filed": {
"onOrAfter": "2019-01-01",
"onOrBefore": "2019-12-31"
},
"terminated": {
"onOrAfter": "2021-01-01",
"onOrBefore": "2021-12-31"
}
},
"page": 1,
"pageSize": 100
}
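Both queries above are several single-attribute fragments placed side by side in one JSON object. A sketch of assembling them in Python follows; combine is a hypothetical helper, and raising on a repeated top-level key is a design choice made here so that one fragment never silently overwrites another.

```python
import json


def combine(*fragments):
    """Merge query fragments into one body; top-level keys are ANDed by the API."""
    query = {}
    for fragment in fragments:
        for key, value in fragment.items():
            if key in query:
                raise ValueError(f"duplicate top-level key: {key}")
            query[key] = value
    return query


# The three fragments that make up Query 1.
with_patent = {"patents": {"include": [8112504]}}
with_damages = {"damages": [{"minimumAmount": 1}]}
went_to_trial = {"caseTags": {"include": ["Trial"]}}

query_1 = combine(with_patent, with_damages, went_to_trial)
print(json.dumps(query_1, indent=2))
```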
Conclusion
The Lex Machina query language is a powerful tool for finding data of interest via the API. This is great for populating dashboards, creating notification mechanisms, identifying activity of interest in the court system or any other use of value to you.
You can build subqueries in small pieces and combine them into larger queries as you see fit. It is like a Lego set for querying legal data!
If you have any challenges, issues or concerns, always feel free to raise them to dslusher@lexmachina.com. Happy building and merry querying!