October 28, 2008

Google Books — More Open Access?

Posted by John Baez

News flash!

A while back, various parties including five companies in the Association of American Publishers sued Google over their ‘Google Book Search’ feature. But now they’ve reached a settlement, which seems likely to affect us all.

It’s a complicated case, and the settlement is about 130 pages long. I don’t really understand it. But…

For one thing, Google coughed up a little cash — $125 million — to set up a Book Rights Registry, which will resolve existing claims by authors and publishers and cover legal fees.

But as another part of the deal, the Authors Guild and the Association of American Publishers have agreed to work with Google to expand their Book Search project — since the settlement acknowledges the rights and interests of copyright holders, and lets them get paid for online access.

Google is putting a positive spin on it all. They say that the new deal will give:

  • More access to out-of-print books — generating greater exposure for millions of in-copyright works, including hard-to-find out-of-print books, by enabling readers in the U.S. to search these works and preview them online.
  • Additional ways to purchase copyrighted books — building off publishers’ and authors’ current efforts and further expanding the electronic market for copyrighted books in the U.S. by offering users the ability to purchase online access to many in-copyright books.
  • Institutional subscriptions to millions of books online — offering a means for U.S. colleges, universities and other organizations to obtain subscriptions for online access to collections from some of the world’s most renowned libraries.
  • Free access from U.S. libraries — providing free, full-text, online viewing of millions of out-of-print books at designated computers in U.S. public and university libraries.
  • Compensation to authors and publishers and control over access to their works — distributing payments earned from online access provided by Google and, prospectively, from similar programs that may be established by other providers, through a newly created independent, not-for-profit Book Rights Registry that will also locate rightsholders, collect and maintain accurate rightsholder information, and provide a way for rightsholders to request inclusion in or exclusion from the project.

Also, with the lawsuits settled, libraries at the Universities of California, Michigan, Wisconsin, and Stanford are expected to make their collections available to Google Book Search. Here’s a press release issued by these universities:

Major Universities See Promise in Google Book Search Settlement

ANN ARBOR, Mi; PALO ALTO, Ca, OAKLAND, Ca - The University of California, University of Michigan, and Stanford University announce today their joint support for the outstanding public benefits made possible through the proposed settlement agreement submitted to the United States District Court, Southern District of New York by Google Inc. and plaintiffs the Authors Guild, Inc. et al.

The proposed settlement will expand access to books in the Google Book Search project. Google Book Search is an ambitious project to digitize the print collections of the world’s greatest libraries and make them searchable via the Internet. The project will make it possible for libraries to preserve millions of books and assure numerous other public and academic benefits.

“Millions of books are held in our libraries as a public trust,” said Daniel Greenstein, Vice Provost at the University of California. “This settlement will help provide broad access to them as well as other public benefits, and it also promises to promote innovation in scholarship. For these reasons, UC is pleased to have given input along with Universities of Michigan and Stanford in support of the public good, and we look forward to playing a continuing role by contributing UC library volumes to the development of this rich online resource.”

The universities were not direct parties to the agreement, and there are some aspects of it the universities would change; however they believe it is favorable overall to the principles and intentions that led them to join the program as early as 2004.

Among the important benefits to higher education are:

  • Free full text access at public libraries around the country
  • Free preview and ability to either find the book at a local library or through a consumer purchase.
  • A first-ever database of both in-copyright and out-of-copyright (public domain) works on which scholars can conduct advanced research (known as the “research corpus”). For example, a corpus of this sort will allow scholars in the field of comparative linguistics to conduct specialized large scale analysis of language, looking for trends over time and expanding our understanding of language and culture.
  • Enabling the sharing of public domain works among scholars, students and institutions. Not only will scholars and students at other universities be able to read these online, but this will make it possible to provide large numbers of texts to individuals wishing to perform research;
  • Institutional subscriptions providing access to in-copyright, out-of-print books;
  • Working copies of partner libraries’ contributed works for searching and web services complementary to Google’s.
  • Accommodated services for persons with print disabilities — making it possible for persons with print disabilities to view or have text read with the use of reader technology;
  • Digital copies of works digitized by Google provided to the partner libraries for long term preservation purposes. This is important because, as university libraries, we are tasked by the public to be repositories of human knowledge and information.

I’d like to know more about this!

Posted at October 28, 2008 6:52 PM UTC

Re: Google Books — More Open Access?

My experience is that most scans at Google Books now are terrible with their low resolution and overall low quality.

This is also true for many commercial journals that have scans of their old issues. For example, most libraries do not have access to the pre-1993 issues of the Journal of Pure and Applied Algebra. I was a guest for two weeks at the Newton Institute in Cambridge, which has access to JPAA, and downloaded about 15 papers from issues of interest to me; however, I observed dirty (with spurious shades) and low-resolution scans with hardly recognizable math symbols (the price for single-paper downloads by non-subscribers being enormous). After thus obtaining a legal, paid copy of R. Street’s paper “Formal theory of monads” from 1972, I got upset and scanned another high-quality copy for myself from a paper issue of the journal. Now I own a bad legal copy and a good illegal copy of the same paper, the same pages. What is the legal status of having both? Funny what stupid laws mankind has created, seemingly for its own “good”. I know some good stores which give you for free any items that you bought spoiled, past their expiration date, with errors, or sold at an illegal price by clerical error.

Should I therefore post the scan above publicly; do they consider my time spent reading unreadable characters collateral damage?

Posted by: Zoran Skoda on October 29, 2008 2:19 PM | Permalink | Reply to this

Re: Google Books — More Open Access?

Zoran wrote:

Should I therefore post the scan above publicly […] ?

Since doing this may be illegal, I can’t advise you to do it.

However, a famous topologist (who probably wants to stay anonymous in this context) has suggested that scientists use a peer-to-peer filesharing system like BitTorrent to illegally distribute old math papers and circumvent the overly expensive journals — just like teenage kids do for pop music.

The Byelorussians are already doing something similar, by making large numbers of expensive math and physics books freely available in djvu format.

But, I suspect that scientists have more to lose from this sort of mass rebellion than teenage music fans do. The publishers don’t seem interested in suing anyone for emailing someone else a copy of a published paper, even if it’s technically illegal. But if someone started spreading around hundreds of papers, and got caught, they might get sued — and since academics are fairly law-abiding and cautious people, this might be enough to put a lid on the practice.

Posted by: John Baez on October 29, 2008 4:27 PM | Permalink | Reply to this

Re: Google Books — More Open Access?

Posted by: Blake Stacey on October 29, 2008 9:28 PM | Permalink | Reply to this

Re: Google Books — More Open Access?


Is the low quality text scan that you’re complaining about low resolution text or blurry text? I’ve seen many blurry scans of old mathematics papers, and was puzzled about that. I once tried scanning some old typewriter text of my own at screen type resolution and discovered that the results were very resistant to compression in PDF or JPG format. In retrospect, I wonder if the blurriness was intentionally introduced in order to allow much deeper compression by the compression algorithms so that the document can be more efficiently transmitted over the internet. If my scanner weren’t buried in the closet in a box I would test this theory by rescanning and blurring the images in one of my photo editing programs before attempting to compress them.

Posted by: Richard on October 30, 2008 3:56 AM | Permalink | Reply to this

Re: Google Books — More Open Access?

I’ve long felt that word recognition software, turning scans into easy .tex, is the real solution. The problem is that nothing like that exists yet as far as I know, although I know that Google has plenty of proprietary word recognition stuff. I think it’s do-able if Google (or AMS) are interested… and I hope I live to see it get done!

Posted by: Daniel Moskovich on October 31, 2008 11:57 AM | Permalink | Reply to this

Re: Google Books — More Open Access?

Librarians have joined the fray. Many of them are quite worried that Google may now effectively ‘own’ most books whose copyright has expired. Here’s something from an email I got about this:

Library associations ask judge to assert vigorous oversight of proposed Google Book Search Settlement

WASHINGTON, DC – The American Library Association (ALA), the Association of College and Research Libraries (ACRL) and the Association of Research Libraries (ARL) today filed comments with the U.S. District Court for the Southern District of New York for the judge to consider in his ruling on the proposed Google Book Search Settlement. The associations asked the judge to exercise vigorous oversight of the interpretation and implementation of the settlement to ensure the broadest possible benefit from the services the settlement enables.

Representing over 139,000 libraries and 350,000 librarians, the associations filed the brief as members of the plaintiff class because they are both authors and publishers of books. The associations asserted that although the settlement has the potential to provide public access to millions of books, many of the features of the settlement, including the absence of competition for the new services, could compromise fundamental library values including equity of access to information, patron privacy and intellectual freedom. The court can mitigate these possible negative effects by regulating the conduct of Google and the Book Rights Registry the settlement establishes.

“While this settlement agreement could provide unprecedented access to a digital library of millions of books, we are concerned that the cost of an institutional subscription may skyrocket, as academic journal subscriptions have over the past two decades,” Erika Linke, president of ACRL, said.

Under the settlement, Google, the Association of American Publishers and the Authors Guild resolved their legal dispute over the scanning of millions of books provided by research libraries. The library associations are not asking the judge to reject the settlement. Instead, they are requesting the judge to carefully monitor the parties’ behavior once the settlement takes effect.

Jim Rettig, president of ALA, said the proposed settlement, “offers no assurances that the privacy of what the public accessed will be protected, which is in stark contrast to the long-standing patron privacy rights libraries champion on behalf of the public.”

Although the filing deadline for comments to the judge was recently extended by four months, the associations moved forward with filing by the original deadline to help inform the public as it considers this important and complex matter.

“The filing before the court by the library associations demonstrates that the associations will be vigilant in highlighting the interests of the public in this settlement. We have asked the court to exercise vigorous oversight to ensure that the powerful groups that control content do not leave individual researchers, libraries, other cultural organizations and the public without an effective voice,” Tom Leonard, president of ARL, said.

The library associations’ filing can be viewed on the ALA or ARL Web sites.

Posted by: John Baez on May 6, 2009 4:53 PM | Permalink | Reply to this

Re: Google Books — More Open Access?

The head of our university library at Stellenbosch here in South Africa, Ellen Tise, is becoming the president of the International Federation of Library Associations (IFLA) today. She has stated her presidential theme as focusing specifically on “client-orientation, the library as a physical place, the role of libraries in community inclusivity and the need for strong partnerships”. I’m not sure where she stands on issues like Google Books or the monopoly publishing houses like Springer and Elsevier, but I’d love to find out!

Posted by: Bruce Bartlett on August 27, 2009 9:36 AM | Permalink | Reply to this

Re: Google Books — More Open Access?

Ask sometime! I’ve found that librarians tend to be very nice people (as long as you don’t talk too loud) and they genuinely enjoy it when faculty take an interest in library issues. After all, they’re used to being taken for granted except when somebody has a complaint!

Posted by: John Baez on August 27, 2009 8:37 PM | Permalink | Reply to this

Re: Google Books — More Open Access?

The latest news:

Published: May 20, 2009

SAN FRANCISCO — In a move that could blunt some of the criticism of Google for its settlement of a lawsuit over its book-scanning project, the company signed an agreement with the University of Michigan that would give some libraries a degree of oversight over the prices Google could charge for its vast digital library.

Google has faced an onslaught of opposition over the far-reaching settlement with authors and publishers. Complaints include the exclusive rights the agreement gives Google to publish online and to profit from millions of so-called orphan books, out-of-print books that are protected by copyright but whose rights holders cannot be found.

The Justice Department has also begun an inquiry into whether the settlement, which is subject to approval by a court, would violate antitrust laws.

Google used the opportunity of the University of Michigan agreement to rebut some criticism.

“I think that it’s pretty short-sighted and contradictory,” said Sergey Brin, a Google co-founder and its president of technology. Mr. Brin said the settlement would allow Google to offer widespread access to millions of books that are largely hidden in the stacks of university libraries.

“We are increasing choices,” Mr. Brin said. “There was no option prior to this to get these sorts of books online.”

Under Google’s plan for the collection, public libraries will get free access to the full texts for their patrons at one computer, and universities will be able to buy subscriptions to make the service generally available, with rates based on their student enrollment.

The new agreement, which Google hopes other libraries will endorse, lets the University of Michigan object if it thinks the prices Google charges libraries for access to its digital collection are too high, a major concern of some librarians. Any pricing dispute would be resolved through arbitration.

Only the institutions that lend books to Google for scanning — now 21 libraries in the United States — would be allowed to object to pricing.

The new agreement also gives the university, and any library that signs a similar agreement, a discount on its subscription proportional to the number of books it contributes to Google’s mass digitization project. Since Michigan is lending a large number of books, it will receive Google’s service free for 25 years.

“This agreement gives us a number of things in the context of the settlement that are valuable to us and we think are valuable to other libraries,” said Paul Courant, dean of libraries at the University of Michigan.

The American Library Association, which has asked the court to oversee aspects of the settlement, said the new agreement is a step in the right direction but is insufficient to ensure that Google does not set artificially high prices for its digital collection.

“Any library must have the ability to request that the judge review the pricing should a dispute arise,” said Corey Williams, associate director at the association’s Washington office.

Since libraries that contribute books will receive discounts, they may have fewer incentives to complain about prices.

The new agreement does not address other criticism, including the complaints over orphan works and worries that the agreement does not protect the privacy of readers of Google’s digital library.

The University of California is one of the institutions that lent books to Google for scanning.

Posted by: John Baez on May 21, 2009 7:05 AM | Permalink | Reply to this

Re: Google Books — More Open Access?

In the long run, what matters here is that the books are getting scanned.

Universities and colleges are the main centres today of digital copyright piracy infringement just not giving a damn. Once Google puts these texts out there, they will become available to all.

Posted by: Toby Bartels on May 22, 2009 1:38 AM | Permalink | Reply to this

Re: Google Books — More Open Access?

(The word ‘piracy’ above should also be struck out, just like ‘infringement’; I forgot to finish proofreading.)

Posted by: Toby Bartels on May 22, 2009 1:40 AM | Permalink | Reply to this

Re: Google Books — More Open Access?

I should add that, for old books out of copyright, sharing Google's scans will be perfectly legal. For orphaned books, it will be as legal as the scans themselves. Sure, Google may complain about violating their Terms of Service (depending on just what those are), but if you get the scan from a friend and never agree to Google's ToS, then you're not liable. It would be easy to set up a BitTorrent site to redistribute Google scans that would be on far surer legal ground than, say, YouTube.

Posted by: Toby Bartels on May 22, 2009 6:28 PM | Permalink | Reply to this

All About the Orphans; Re: Google Books — More Open Access?

Brewster posted “It’s All About the Orphans” on the blog of the Open Content Alliance, focusing on the plight of “orphan works” - that vast number of books that are still under copyright but whose authors can no longer be found:

“After digesting the proposed Google Book Settlement, it becomes clear that the dizzyingly complex agreement is, in essence, an elaborate scheme for the exploitation of orphan works… The upshot, if the Settlement is approved, would be legal protection for Google, and only for Google, to scan and provide digital access to the orphan works. Presto! … So, should the Settlement be approved, Google will be handed exclusive access to the orphans, and the public loses out… I, personally, am amazed at this creative use of class action law. The three parties have managed to skirt copyright law, bypass legislative efforts, and feather their own nests - all through the clever use of law intended to remedy harms. This Settlement, if approved by the judge, will accomplish things appropriate to a legislative body not to private corporate boardrooms. Let’s live under the rule of law, as arduous as that might be, and free the orphans, legitimately, not for one corporation but for all of us.”

Posted by: Jonathan Vos Post on May 23, 2009 11:33 PM | Permalink | Reply to this

Re: Google Books — More Open Access?

For orphaned books, it will be as legal as the scans themselves

As Jonathan reminds me, this is not true, thanks to Google's cushy class-action settlement.

Reader: Have you written and published a book? Or maybe you just wrote the introduction? Or a paper that was included in a (non-periodical) book? (It doesn't matter in what country.) Then you're a party to this lawsuit.

Posted by: Toby Bartels on May 24, 2009 2:01 AM | Permalink | Reply to this

Re: Google Books — More Open Access?

You folks have probably seen that the Google Books settlement is attracting more attention lately… but if not, try this:

Sample quote:

Opponents and supporters of Google’s plans are lining up for a showdown that will come to a head on 4 September, the deadline for submissions to be lodged with a Manhattan court that is reviewing the scheme, known as Google Book Search.

The court is considering whether or not to give the go-ahead to the settlement of a class-action suit that Google reached with key publisher and writers’ groups last October. The settlement, approved by the Authors Guild and the Association of American Publishers, provides for a pot of $125m (£75m) that Google has agreed to pay to cover copyright infringements it had already committed by scanning books online.

The settlement would also give writers and publishers the equivalent of 63% of future revenues generated by sales of digital books and other income, while Google would keep the remaining 37%.

And this:

Sample quote:

In the latest objection, Scott E. Gant, an author and partner at Boies Schiller & Flexner, a prominent Washington law firm, plans to file a sweeping opposition to the settlement on Wednesday urging the court to reject it.

“This is a predominantly commercial transaction and one that should be undertaken through the normal commercial process, which is negotiation and informed consent,” Mr. Gant said in an interview. Google and its partners are “trying to ram this through so that millions of copyright holders will have no idea that this is happening.”

Unlike most previous objections to the project, which focused on policy issues and recommended modifications to the settlement, Mr. Gant argues that the agreement, which gives Google commercial rights to millions of books without having to negotiate for them individually, amounts to an abuse of the class-action process. He also contends that it does not sufficiently compensate authors and does not adequately notify and represent all the authors affected.

Posted by: John Baez on August 27, 2009 8:46 PM | Permalink | Reply to this

Re: Google Books — More Open Access?

You folks have probably seen that the Google Books settlement is attracting more attention lately…

Of course I've noticed; it's not as if I get most of my news about this sort of thing from this blog. The idea is laughable! <g>

Much as I hope for the success of Google Book Search, and much as I hate to see copyright law used to limit the dissemination of information, the implications of this case go beyond copyright law; it seems as if it would set a bad precedent for class-action lawsuits as a way to undertake commercial transactions. (But perhaps that precedent is already established?)

Posted by: Toby Bartels on August 27, 2009 9:42 PM | Permalink | Reply to this

Re: Google Books — More Open Access?

I’ve been suspicious of this whole thing since trying to understand the particular class action suit at the centre of it. As far as I can make out the structure of the suit is:

The suit is brought on behalf of Class A (in this case, authors of orphaned works). But no member of Class A actually appears in court, or is even aware of the suit. Instead, their interests are “represented” in court by members of Class B (in this case, the Authors Guild), with which they have empty intersection. And Class B determines that what would be most in the interests of Class A would be for all the defining rights of members of Class A (and the resulting potential income) to be transferred to Class B! It’s as if burglars brought a class action suit “on behalf of” householders, arguing that it’s in the best interests of householders for their jewellery and TVs to be taken away by burglars. It’s almost parodically corrupt.

I’d much rather that the relevant rights were abolished (i.e. the period of copyright was shortened to a defensible length) than that a precedent be set that some random group of self-interested carpet-baggers be granted the right to appropriate anybody else’s property that happens to be lying around undefended and sell it for their own profit.

There’s surely got to be a better way to do this.

Posted by: Tim Silverman on August 28, 2009 12:25 PM | Permalink | Reply to this

Re: Google Books — More Open Access?

Another recent wrinkle: a group of University of California faculty, while not opposing the proposed settlement with the Authors Guild, has pointed out that the settlement is oriented toward the priorities of authors who make their living from royalties. Academic authors, on the other hand, typically prioritize wide dissemination over profit; academic works furthermore account for most of the books in question. See this blog post.

Posted by: Mark Meckes on August 28, 2009 12:43 PM | Permalink | Reply to this

Re: Google Books — More Open Access?

Some news from Associated Press:

Sep 10th, 2009 | MOUNTAIN VIEW, Calif. – Google will let other online companies sell its digital copies of out-of-print books if a class-action settlement with U.S. authors and publishers wins court approval.

The company announced that concession Thursday after mounting opposition to Google’s 10-month-old settlement. Among other things, opponents of the deal argue it would give Google a digital monopoly on millions of books that are no longer being published.

Google now says it will give […] and other rivals access to its digital library of out-of-print books. The other merchants would then be allowed to keep most of the revenue from the sales.

Google announced the change shortly after the head of the U.S. Copyright Office advised a congressional committee that parts of the settlement violate federal law.

Posted by: John Baez on September 10, 2009 8:32 PM | Permalink | Reply to this

