US court records are not copyrighted, but the US court system operates a paywall called "PACER" that is supposed to recoup the costs of serving text files on the internet, charging $0.10/page for access to the public domain and illegally profiting to the tune of $80,000,000/year.
The response to PACER is RECAP, a browser plugin that captures all the pages anyone pays for in PACER and puts them in a free repository, mirrored on the Internet Archive, that anyone can access. Among other things, RECAP revealed that the courts were failing in their duty to remove sensitive personal information (like Social Security Numbers or the home addresses of stalking survivors) from their records. Aaron Swartz was key in revealing the scandal of PACER, and it earned him the ire of the federal prosecutors who later hounded him to his suicide, so further editions of RECAP were dedicated to his memory.
Now the Free Law Project has made the most significant advance in RECAP to date: liberating "approximately 3.4 million orders and opinions from approximately 1.5 million federal district and bankruptcy court cases dating back to 1960," and doing text-extraction on older files that were served as bitmaps, making them fully searchable.
At Free Law Project, we have gathered millions of court documents over the years, but it’s with distinct pride that we announce that we have now completed our biggest crawl ever. After nearly a year of work, and with support from the U.S. Department of Labor and Georgia State University, we have collected every free written order and opinion that is available in PACER. To accomplish this we used PACER’s “Written Opinion Report,” which provides many opinions for free.