Sunday, January 30, 2011

Fw: How to turn texts from the Digital Library of India into pdf-s

----- Original Message -----
From: "Sumit Guha" <sguha@HISTORY.RUTGERS.EDU>
To: <H-ASIA@H-NET.MSU.EDU>
Sent: Monday, January 31, 2011 3:21 AM
Subject: How to turn texts from the Digital Library of India into pdf-s


> Dear colleagues,
> About a year ago Jon Keune, of Academia Sinica and Columbia University
> sent me a useful set of instructions on how to access the valuable
> resources of the Digital Library of India. At my request, he is now
> sending them out for the H-ASIA list.
>
> I am sure that you will all join me in thanking Jon for his help.
>
> Best wishes, Sumit Guha
> Rutgers University
> *************************************************************************
> Dear H-ASIA colleagues,
>
> The holdings of the Digital Library of India are fantastic, but some may
> find the DLI interface for viewing single images cumbersome and prefer
> working with a PDF file. A piece of freeware for Windows called "DLI
> Downloader" ( http://sanskritdocuments.org/scannedbooks/dlidownloader/ )
> claims to help with this, but I haven't been able to get it to work for
> me. Instead, I've been using a slightly more involved process to create
> PDF's from the individual images in DLI. It was suggested to me that some
> members of H-ASIA may find this useful, so I am sharing it here. Caveat:
> I'm using a PC and Adobe Acrobat (Full, not Reader) for this process. I
> can't speak to how or whether this will work on other platforms and with
> other PDF creation programs.
>
> To begin, you'll need to download all the individual images of the desired
> text from DLI. The freeware program called "LTVT Image Grabber" can do
> this after its default settings are changed as I describe below. The
> program can be downloaded in the file "Image_Grabber.zip" at
> http://ltvt.wikispaces.com/Utility+Programs . After downloading, extract
> Image Grabber from the ZIP file onto your computer and follow these steps:
>
> 1. In an internet browser (I use Firefox), locate the desired book in DLI
> and open the first page image by clicking on the "BookReader-1" link. This
> will open a new tab or window with a single page of the book. At the
> center of the bottom of the screen, to the right of the navigation arrows
> and page number indicator ("X of Y Pages") is a menu box with options like
> PTIFF, HTML, TXT, RTF and Meta. Ensure that PTIFF is selected.
>
> 2. Note down the final page number (the Y value in the "X of Y Pages"
> area) between the navigation arrows at the bottom center of the page.
> You'll need this for step 7.
>
> 3. Right-click somewhere on that browser page and select "Copy Image
> Location" (in Firefox, at least) to copy the image URL to the Windows
> clipboard. Note: this will not be the same URL that is listed in the web
> browser address field at the top of the window.
>
> 4. Start up the LTVT Image Grabber program and click on the tab "Numeric
> Sequence" near the top of the window in order to reach the settings that
> need to be altered.
>
> 5. In the field labeled "Number Prefix" replace whatever is there with the
> URL copied from the browser in Step 3.
>
> 6. Make the following changes in Image Grabber:
> 6a. In the field "Number suffix" type ".tif" (without quotation marks)
> 6b. In the field "Downloaded file name suffix" also type ".tif" (again, no
> quotes)
> 6c. Click on the Destination Folder button to choose where you want the
> downloaded images to be saved. Ultimately each individual downloaded image
> will end up in this folder, with the name "AS15-M-" followed by the
> downloaded page number.
>
> 7. Also in Image Grabber:
> 7a. In "Start #" enter the first page number (usually 1)
> 7b. In "End #" type in the number that you observed in step 2.
> 7c. In "# Digits" enter the number of digits in the final page number
> (usually 3, or occasionally 4 for very large files)
>
> 8. Return to the "Number Prefix" field where you pasted the DLI image
> address in step 5.
> 8a. Delete the ".tif" from the end of the address
> 8b. Delete the final two, three or four digits of the remaining address,
> corresponding to the number you typed in "# Digits" in step 7c. If this
> step isn't followed exactly, Image Grabber will give error messages when
> you try to download.
>
> 9. Click on the "Retrieve Files" button at the lower left of the Image
> Grabber window, and the downloading will begin. It may take some time for
> all of the images to download, depending on the size of the book and
> internet connection speed.
>
> 10. Once the downloading is finished, open Adobe Acrobat and go to File |
> Create PDF | From Multiple Files. This will call up a small window asking
> where to look for the files. Navigate to the destination folder you
> entered in step 6c, select all the images to be added to your new PDF
> document, and follow through with the rest of the self-explanatory Acrobat
> process to create a single PDF from all the individual image files.
>
> This may seem complicated at first glance, but after a couple tries the
> logic of it will be obvious.
>
> Best regards,
> Jon
>
>
> Jon Keune
> Ph.D. candidate (ABD)
> Religion Department
> Columbia University, New York City
>
> Visiting Research Associate
> Institute of Ethnology
> Academia Sinica
> Taipei, Taiwan
>
>
>
> --
> http://history.rutgers.edu/index.php?option=com_content&task=view&id=159&Itemid=140

No comments:

Post a Comment