Jl. Medan Merdeka Selatan No. 3-5
Jakarta 10110, Indonesia
Phone (021) 310-2127, 314-4944, 334-236; Fax (62-21) 314-4945
April 13, 2009
Dear Sir,
The Library of Congress of the United States sends its greetings. We are the world’s archive, whose mission is to preserve knowledge and information and make it available throughout the world in various media and formats. The Library remains neutral on the content of information. We wish to support Indonesia’s entrance into the digital world. We have pursued our Digital Initiatives in various ways; one of them is harvesting websites accessible on the internet in order to preserve them for archival, historical and scholarly purposes only.
During the upcoming 2009 election, selected websites that provide significant content and carry information on the election will be a resource for future generations, one that will be forgotten and lost if not well archived and preserved.
It is important that your website/blog be included in this web-archiving project. It may add substantial content to the collection and archive. If you agree to participate, we will begin harvesting automatically upon your approval. We are also collecting election posters, handouts and other materials. We are not aware of any action or changes required on your part.
If you have any additional questions, comments or recommendations, I welcome them.
Sincerely Yours,
William P. Tuchrello
Field Director – Attaché
Library of Congress - Southeast Asia Region
American Embassy – Jakarta
Sidik Rizal
Jl. Teuku Umar Kav. 16-17
Hotel Cibitung Indah
Gedung Paku Bekasi
The other attached letter follows.
Web Capture Permissions Plan
Collection Name: Indonesia General Elections 2009
Sponsoring Division/Custodial: OvOps, Jakarta Office
The Library of Congress has overseas offices in six countries that collect materials from regions where traditional acquisitions are considered difficult. The Jakarta Overseas Operations Office has developed this Web Archive about the Indonesia General Elections of 2009. The collection concentrates substantially on the ongoing process of the Parliamentary election, the Legislative election, and the Presidential election in Indonesia. The general (legislative) election is on April 9, 2009. The first round of the Presidential Election is in July 2009, and the second/final round is in September 2009 (Presidential Election dates TBD). Sites will include campaign and party websites, and sites with information about bio-marine environment issues, an important campaign issue in 2009. Sites in this archive will be primarily Indonesian. Email notices will be sent in English but may also be translated into Indonesian if the project team decides that doing so would improve the chances of a response.
As a part of any Web Capture project, permissions and notices must be considered prior to the start of the capture process. The OSI Web Capture team, using information provided by Library Services regarding types of sites to be included in the Indonesia General Elections 2009 collection, has developed this Web Capture Permissions Plan.
Collecting web-based material is an important mission of the Library and will be an ongoing activity. As in previous collections, we would like to request blanket permission from Web site owners. A positive response to our blanket permission request would allow us to continue to collect the site until otherwise notified. If we do not receive a positive response, we would continue to notify Web site owners in future collections. The notification email will state that we are collecting for the Indonesia General Elections Archive and will indicate that the site may be collected as part of future Web archive projects.
1. Permissions Responsibilities
1.A. Project Team (OvOps/Jakarta):
Translate Permissions letter from English to Indonesian (if desired), deliver to OSI team
Perform research to locate contact information (email)
Input this information into the Web Capture tool called the “Leaderboard” along with other selection information
In stage 2 of Leaderboard, click the “send” button to mail permissions
1.B. Capture and Technical team (OSI):
Set up the Leaderboard to be able to automatically mail notices to identified contacts, input translated letter once received
Track all responses, which may inform the capture process
Forward selection-related questions in response to notices to Indonesia Election project team.
2. Notices: For this collection, there will be two different types of notices sent to content owners, depending on the category or type of Web site. The two notices are Permission/Permission (Opt-in to crawl and display), and Notice/Permission (Notice of crawl, opt-in to display offsite).
2.A. Permission/Permission (Opt-in to crawl and display)
Notice will be sent to alert Web content owners about the project, the selection of their site, and the Library's desire to collect the site and make it available to researchers onsite at the LC campus and offsite via the Library’s public access Web site. Content owners must opt in: LC must receive a positive "yes" from the content owner in order to both collect and display offsite.
2.A.1. Permission/Permission Guidelines
To be captured, sites in this category must meet the following criteria:
URL added to the Leaderboard (Recommenders)
Category has been assigned (Selection Coordinator)
Contact information has been located (Recommenders/Permissions person)
Notice has been mailed (Selection Coordinator/Permissions person via Leaderboard)
Crawl Permission has been granted (Content owner)
To be displayed to researchers offsite, sites in this category must meet all of the above, and the following additional criteria:
Display Permission has been granted (Content owner)
2.A.2. Categories of URLs in Permission/Permission group include:
2.A.3. Permission/Permission Notice Text
To Whom It May Concern:
The United States Library of Congress has selected your Web site for inclusion in its historic collections of Internet materials related to the Indonesia General Elections of 2009. The Library's traditional functions, acquiring, cataloging, preserving and serving collection materials of historical importance to the Congress and to the American people to foster education and scholarship, extend to digital materials, including Web sites. We request your permission to collect your web site and add it to the Library's research collections. We also ask that we be allowed to display the archived version(s) of your web site.
The following URLs have been selected:
With your permission, the Library of Congress or its agent will engage in the collection of content from your Web site at regular intervals over time. The Library will make this collection available to researchers onsite at Library facilities. The Library also wishes to make the collection available to offsite researchers by hosting the collection on the Library's public access Web site. The Library hopes that you share its vision of preserving Internet materials and permitting researchers from across the world to access them. If you agree to permit the Library to collect your Web site, please click the following link to signify your consent. This link also includes a separate consent for permitting the Library to provide offsite access to your materials through the Library's Web site.
For several years, the Library of Congress has collected Web sites within certain themes or topics for which we were required to seek permission for each new collection developed by the Library, even if permission had been granted in the past. As our collections have grown, we have had to contact some Web site producers repeatedly. To reduce this duplication and to save site owners from having to respond to multiple requests for information, we are now requesting permission for the Library to collect, over time and in varying frequency, sites of research interest. Your site has been identified as a Web site of interest related to the Indonesia General Elections 2009. If you grant this permission, we will capture your site for inclusion in the Indonesia General Elections 2009 Web Archive and may include it in future collections. If you grant this permission and in the future you no longer wish to be included in the Library's Web archives, please contact us and we will cease collection of your URL.
Our Web Archives are important because they contribute to the historical record, capturing information that could otherwise be lost. With the growing role of the Web as an influential medium, records of historic events could be considered incomplete without materials that were "born digital" and never printed on paper. The Library has developed previous Web Archives, some of which are available through the Library's Web site. For more information about these Web Archive collections, please visit our Web site.
If you have questions, comments or recommendations concerning the Indonesia General Elections 2009 project or future projects, please e-mail the Library's Web Capture team at your earliest convenience, or contact the Jakarta Office staff.
Thank You,
Web Capture Team
Library of Congress
Washington, D.C.
2.B. Notice/Permission (Notice of crawl, opt-in to display offsite)
Notice will be sent to alert Web content owners about the project, the selection of their site, and the Library's plans to make the collection available to researchers onsite at the LC campus and offsite via the Library’s public access Web site. A positive response will be required for offsite display but not for collection. If no response is received, the Web site will be archived but will not be made available to researchers offsite. If the content owners do respond, the response will include the understanding that the Library may engage in collecting the site for future Web archives.
2.B.1. Notice/Permission Guidelines
To be captured, sites in this category must meet the following criteria:
URL added to the Leaderboard (Recommenders)
Category has been assigned (Selection Coordinator)
Contact information has been located (Recommenders/Permissions person)
Notice has been mailed (Selection Coordinator/Permissions person via Leaderboard)
To be displayed to researchers offsite, sites in this category must meet all of the above, and the following additional criteria:
Display Permission has been granted (Content owner)
2.B.2. Categories of URLs in the Notice/Permission group include:
Educational/Research Organizations
Political Commentary
Political Party
Watchdog, Public Policy, Political Advocacy Organizations
X (maybe a few)
Religious Organizations
Support Groups
2.B.3. Notice/Permission Text
To Whom It May Concern:
The United States Library of Congress has selected your Web site for inclusion in its historic collections of Internet materials related to the Indonesia General Elections of 2009. The Library’s traditional functions, acquiring, cataloging, preserving and serving collection materials of historical importance to the Congress and the American people to foster education and scholarship, extend to digital materials, including Web sites.
The following URL has been selected:
The Library of Congress or its agent will engage in the collection of content from your Web site at regular intervals. The Library will make this collection available to researchers onsite at Library facilities.
The Library also wishes to make the collection available to offsite researchers by hosting the collection on the Library's public access Web site. The Library hopes that you share its vision of preserving materials about the Indonesia General Elections 2009 and permitting researchers from across the world to access them.
For several years, the Library of Congress has collected Web sites within certain themes or topics. As our collections have grown, we have had to contact some Web site producers repeatedly. To reduce this duplication and to save site owners from having to respond to multiple requests for information, we are now giving notice that the Library will collect, over time and in varying frequency, sites of research interest. Your site has been identified as a Web site of interest related to the Indonesia General Elections 2009. If you grant permission for offsite access to your materials through the Library’s Web site, we will take your permission as notice that we may include your site in our future collections. If in the future you no longer wish your materials to be displayed offsite or if you wish to be contacted for each new collection, please contact us.
Our Web Archives are important because they contribute to the historical record, capturing information that could otherwise be lost. With the growing role of the Web as an influential medium, records of historic events could be considered incomplete without materials that were "born digital" and never printed on paper. The Library has developed previous Web Archives, some of which are available through the Library's Web site. For more information about these Web Archive collections, please visit our Web site.
If you have questions, comments or recommendations concerning the Indonesia General Elections 2009 project or future projects, please e-mail the Library's Web Capture team at your earliest convenience, or contact the Jakarta Office staff.
Thank You,
Web Capture Team
Library of Congress
Washington, D.C.
2.C. No Notice: For federal government URLs, no notice is required as they are in the public domain.
To be captured and displayed to researchers offsite, sites in this category must meet the following criteria:
URL added to the Leaderboard (Recommenders)
Category has been assigned (Selection Coordinator)
Web Capture team is notified that Federal Government site is ready to be manually processed (Selection Coordinator/Permissions person)
Category: Government – Foreign. Notice Type: No Notice.
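The capture and display criteria listed in sections 2.A to 2.C can be read as a small set of checks. The following Python is only a sketch of that reading, not the Leaderboard's actual logic: the `SiteRecord` class, its field names, and the two functions are hypothetical, and "URL added to the Leaderboard" is assumed to be implied by a record existing at all.

```python
from dataclasses import dataclass
from enum import Enum


class NoticeType(Enum):
    PERMISSION_PERMISSION = "Permission/Permission"  # opt-in to crawl and display
    NOTICE_PERMISSION = "Notice/Permission"          # notice of crawl, opt-in to display offsite
    NO_NOTICE = "No Notice"                          # no notice required


@dataclass
class SiteRecord:
    """Hypothetical record for one nominated URL (field names are illustrative)."""
    url: str
    notice_type: NoticeType
    category_assigned: bool = False
    contact_located: bool = False
    notice_mailed: bool = False
    crawl_permission: bool = False             # granted by the content owner
    display_permission: bool = False           # granted by the content owner
    ready_for_manual_processing: bool = False  # Web Capture team notified (No Notice only)


def may_capture(site: SiteRecord) -> bool:
    """Return True when the capture criteria for the site's notice type are met."""
    if not site.category_assigned:
        return False
    if site.notice_type is NoticeType.NO_NOTICE:
        return site.ready_for_manual_processing
    if not (site.contact_located and site.notice_mailed):
        return False
    if site.notice_type is NoticeType.PERMISSION_PERMISSION:
        return site.crawl_permission
    return True  # Notice/Permission: mailing the notice is enough to crawl


def may_display_offsite(site: SiteRecord) -> bool:
    """Return True when the site may also be shown to offsite researchers."""
    if not may_capture(site):
        return False
    if site.notice_type is NoticeType.NO_NOTICE:
        return True
    return site.display_permission
```

Under these assumptions, a Notice/Permission site whose owner never replies can be captured once the notice has been mailed, but it is not displayed offsite until display permission arrives.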
3. Robots.txt
While lawyers and courts like using robots.txt as a proxy for copyright permissions, that is not the way web managers use it. First, many sites include robots.txt instructions for non-copyright reasons (e.g., reducing server load from the crawl). Second, many sites apply the restrictions only to parts of the site, not to the whole site, which further confuses matters (although one could deem that a full-site restriction is more likely copyright-based). Finally, and perhaps most importantly, this is not something that can be assessed at the selection staff level, which makes the decisions regarding the level of permission that is required but lacks the technical expertise to assess the robots.txt exclusion.
Therefore, the following plan has been established:
1) Sites will be crawled in accordance with this permissions plan.
2) Robots.txt instructions will be captured (but not respected) during the crawl.
3) OSI will document robots.txt throughout the process, including monitoring any correlation between the robots.txt and permissions received or denied.
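One way to document robots.txt without enforcing it is to keep the raw file, record which selected URLs it would have excluded, and tally those findings against the permission responses. The sketch below uses Python's standard `urllib.robotparser`; the function names, the report layout, and the response labels such as "granted" are assumptions made for illustration and do not describe OSI's actual tooling.

```python
from collections import Counter
from urllib.robotparser import RobotFileParser


def robots_report(robots_txt, urls, user_agent="*"):
    """Record what a captured robots.txt would have excluded, without enforcing it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    would_block = [u for u in urls if not parser.can_fetch(user_agent, u)]
    return {
        "raw": robots_txt,            # the captured file, kept verbatim
        "would_block": would_block,   # URLs a compliant crawler would skip
        "restricts": bool(would_block),
    }


def tally(observations):
    """Count (restricts, permission_response) pairs to look for a correlation."""
    return Counter(observations)
```

For example, `tally([(True, "granted"), (False, "denied")])` counts how often a restrictive robots.txt coincided with each kind of response; the crawl itself proceeds according to the permissions plan regardless of the report.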

