Thursday, April 20, 2006

Endeavor Voyager EndUser 2006





Keynote speaker – Tom Turvey, Strategic Partner Development Manager, Google
Google Books and Google Scholar

· Book Search launched at the Frankfurt Book Fair in October 2004
· People search for everything
· Best for finding archive information and linking it with the reader
· Google understands that not everything is on the internet
· books.google.com is integrated into the main search
· Results link directly to the relevant page of the book

Partner Programme
· Partner programme has 1000s of partners
· Partner brand on each page
· Partner is top line on "buy this book"
· 20% of a book is viewable per month per viewer
· Publisher gets revenue from clicks
· Can view two pages each direction
· Very low resolution (72 dpi) – can't print copy or save
· A percentage of every book is never shown
· The five publishers suing Google are themselves partners in the Partner Programme

Library Programme
· Four libraries in the US (Stanford, Harvard, NYPL, Michigan) and one in the UK (Oxford)
· 60% of books are in only one library
· 40% redundancy rate is ok by them
· Only 20% of a library is in public domain
· 10% are in print
· 90% are out of print (in or out of copyright)
· 70% of books published after 1923 may still be in copyright

· Three types of views
· From publisher programme – sample pages
· From library programme – under copyright, no agreement – snippets
· From library programme – public domain – full book, no restrictions
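
The three view types above amount to a small decision rule. Here is a minimal Python sketch of it as I understood it from the talk. The names `View` and `view_for` and the labels "publisher" and "library" are my own, not anything Google showed, and the sketch ignores the case of a library book that is also covered by a publisher agreement.

```python
from enum import Enum


class View(Enum):
    SAMPLE_PAGES = "sample pages"
    SNIPPETS = "snippets"
    FULL_BOOK = "full book"


def view_for(source: str, public_domain: bool = False) -> View:
    """Return the view type for a book, following the three cases in the notes.

    `source` is "publisher" or "library" (hypothetical labels).
    `public_domain` only matters for library books.
    """
    if source == "publisher":
        # Publisher programme: sample pages only.
        return View.SAMPLE_PAGES
    if source == "library":
        # Library programme: full book if public domain, otherwise snippets.
        return View.FULL_BOOK if public_domain else View.SNIPPETS
    raise ValueError(f"unknown source: {source!r}")
```

For example, `view_for("library", public_domain=True)` gives `View.FULL_BOOK`, while a library book still under copyright falls back to `View.SNIPPETS`.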

· Snippet – the full book is scanned – only three small parts are shown no matter how many search matches – links to booksellers and OCLC
· Technical challenges – how to transport books safely from libraries
· Marginalia considered to be of value
· Full text is searchable
· All metadata considered valuable
· Goal is a comprehensive collection – they know that they are not there yet
· Over 100 languages
· Challenges – 100% accuracy in OCR – 100% image quality – Web integration – Metadata accuracy – multiple language support – scanning speed and automation
· Page feedback for quality on each page
· Math formulas are a problem for OCR
· They listen to librarians
· They go to library conferences
· They want to hear from us


Questions

They are working on advanced search options

Publisher books – don't have a link to WorldCat – if the book is scanned as well it will have a WorldCat link – publishers want to sell books so they don't want a link to the library – a known issue – only some publishers have problems

Library of Congress subject headings – some are used as ranking tool behind scenes – maybe – on roadmap but no concrete plans

Do they have a digital preservation policy? – no – not part of their job – agreement with library

Metadata – used as placeholder – bias is to full text indexing – metadata second

Fair use – they currently lean toward the publishers' half of rights – not working on user rights (scholarly use)

WorldCat relationship only – they would like to use others and to hear about them, but currently a library needs to be in WorldCat to be linked

Google has many things now – video, scholar, book, etc. – do they plan on going to one box for all search? – they have references to the others and are working on a combination of indexes, but not yet



Tom Turvey was funny, personable, a great speaker, calm under questioning, forthcoming when he could be, etc.
If you get a chance to hear him, do so.
