About DAR

What is DAR?

The Digital Assets Repository (DAR) is an ecosystem of components developed by the International School of Information Science (ISIS) at the Bibliotheca Alexandrina (BA) to create an institutional repository that maintains the Library’s digital collections. Its flexible architecture allows DAR to accommodate and archive any media type. It also provides public access to digitized collections through a web-based search and browsing facility.

Why DAR?

DAR was built mainly to support the creation, use and preservation of a variety of digital resources. It provides management tools that facilitate creating, managing and sharing the Library’s digital assets. The system is based on evolving standards and can easily be integrated with other systems.

Accessing Books on

DAR currently encompasses the largest Arabic book collection. The contents of out-of-copyright books are fully available on the Internet. For in-copyright books, Internet users can browse only 5% of the book, with a minimum of 10 pages. In addition, the system limits simultaneous access to an in-copyright book to the number of physical copies available at the BA. That is, if the BA has purchased two copies of a book, only two users can access the digital copy at the same time. Only when one of them releases the book can another user access it.
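The two access rules above can be sketched in a few lines of Python. This is only an illustration, not DAR's implementation: the names `preview_page_limit` and `CopyLimitedAccess` are hypothetical, and rounding the 5% share up to a whole page is an assumption, since the text does not say how fractions of a page are handled.

```python
def preview_page_limit(total_pages: int, percent: int = 5, minimum: int = 10) -> int:
    """Pages an Internet user may browse in an in-copyright book."""
    # Integer ceiling of total_pages * percent / 100 (rounding up is an assumption).
    share = (total_pages * percent + 99) // 100
    # Apply the 10-page minimum, but never exceed the length of the book.
    return min(max(share, minimum), total_pages)


class CopyLimitedAccess:
    """Allow as many simultaneous readers as there are physical copies."""

    def __init__(self, physical_copies: int):
        self.physical_copies = physical_copies
        self.readers = set()

    def open(self, user: str) -> bool:
        """Return True if the user now holds a copy, False if none is free."""
        if user in self.readers:
            return True
        if len(self.readers) >= self.physical_copies:
            return False
        self.readers.add(user)
        return True

    def release(self, user: str) -> None:
        """Give the copy back so another user can open the book."""
        self.readers.discard(user)
```

With two physical copies, a third call to `open` is refused until one of the first two readers calls `release`.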

DAR Main Usability Features

DAR lets users choose between different viewing options, search for a keyword or expression, tag and rate books, share them on social networks, and interact with other users by submitting comments. Users may also place books of their choice into different folders, thus creating their own "Bookshelves". Annotation tools are available for highlighting or underlining spans of text, adding sticky notes, etc. Moreover, when a user searches for a book, the system displays several options for narrowing down the search results, a process generally known as Faceted Search.
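As a rough illustration of Faceted Search, the sketch below counts how many results fall under each value of a facet and then narrows the list to one value. The record fields (`language`, `subject`) and the sample titles are hypothetical; in DAR the counts would come from the search index rather than from an in-memory list.

```python
from collections import Counter


def facet_counts(results, fields):
    """Count how many results carry each value of each facet field."""
    return {
        field: Counter(r[field] for r in results if field in r)
        for field in fields
    }


def narrow(results, field, value):
    """Keep only the results whose facet field equals the chosen value."""
    return [r for r in results if r.get(field) == value]


# Hypothetical search results, used only to exercise the functions.
results = [
    {"title": "Book A", "language": "Arabic", "subject": "History"},
    {"title": "Book B", "language": "Arabic", "subject": "Poetry"},
    {"title": "Book C", "language": "French", "subject": "History"},
]
facets = facet_counts(results, ["language", "subject"])
arabic_only = narrow(results, "language", "Arabic")
```

The counts in `facets` are what a user would see next to each narrowing option; picking one option corresponds to a call to `narrow`.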

Technical Aspects of DAR

DAR Modules:

DAR consists of several modules:

The Digital Assets Factory (DAF) provides flexible management of the digitization workflow and a unified means of ingestion into the system. It supports both physical and born-digital materials of different media types. It integrates easily with automated and human phases, checking integrity at each step of the workflow.

DAF is available for download at: DAFWiki
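One common way to check integrity between workflow steps is to compare checksums. The text does not say which mechanism DAF uses, so the following is only a sketch under that assumption; `checksum` and `run_phase` are hypothetical names, and SHA-256 is an arbitrary choice.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of a file's contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def run_phase(name, data, expected, phase):
    """Verify the input, run one workflow phase, and checksum its output.

    `expected` is the checksum recorded when the previous phase finished.
    """
    if checksum(data) != expected:
        raise ValueError(f"integrity check failed before phase {name!r}")
    output = phase(data)
    return output, checksum(output)
```

Each phase hands its output checksum to the next one, so a file altered between phases is detected before any further work is done on it.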

The Digital Assets Metadata (DAM) manages the metadata of the objects within the repository. It consists of a metadata store for METS and uses Fedora for metadata management. The system provides flexible metadata editing through XML templates and dynamic forms. It also allows synchronization with different integrated library systems (ILS) or other data sources (e.g. an application backend); this synchronization is likewise based on XML templates.

The Digital Assets Keeper (DAK) is a storage layer for digital objects responsible for caching, versioning and load balancing.

The Digital Assets Publishing layer (DAP) is a RESTful API for building applications on top of the Repository. Applications can query for new or updated metadata and files, and can also access a slice of the data in the Repository based on their access rights.
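The kind of request described here, "what has changed since a given date, within the slice I am allowed to see", can be modelled as a simple filter. The sketch below does not reproduce the actual API: the function `visible_changes` and the record fields (`id`, `collection`, `modified`) are hypothetical, and access rights are simplified to a set of permitted collections.

```python
from datetime import datetime


def visible_changes(records, since, allowed_collections):
    """Return records modified after `since` that the application may access."""
    return [
        r for r in records
        if r["modified"] > since and r["collection"] in allowed_collections
    ]


# Hypothetical repository records, used only to exercise the function.
records = [
    {"id": "obj-1", "collection": "books", "modified": datetime(2012, 3, 1)},
    {"id": "obj-2", "collection": "maps", "modified": datetime(2012, 5, 9)},
    {"id": "obj-3", "collection": "books", "modified": datetime(2011, 12, 30)},
]
changed = visible_changes(records, datetime(2012, 1, 1), {"books"})
```

Here only `obj-1` is returned: `obj-2` is outside the permitted slice and `obj-3` has not changed since the cut-off date.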

A Discovery Layer provides full text search across the whole collection, with results restricted according to the access rights granted to the user. Full text search is built on Solr and supports five languages: Arabic, English, French, Spanish and Italian.
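A query of this kind could be expressed with Solr's standard `q` (main query) and `fq` (filter query) parameters. The sketch below only builds the parameters. The field names (`content_ar`, `access_group`, and so on) are hypothetical, since the actual DAR schema is not described here, and treating the search text as a quoted phrase is a simplification.

```python
from urllib.parse import urlencode

# Hypothetical per-language text fields; the real schema may differ.
LANGUAGE_FIELDS = {
    "ar": "content_ar",
    "en": "content_en",
    "fr": "content_fr",
    "es": "content_es",
    "it": "content_it",
}


def build_solr_params(text, language, user_groups, rows=10):
    """Build query parameters for a full text search limited by access rights."""
    field = LANGUAGE_FIELDS[language]
    # Escape backslashes and double quotes so the text stays one quoted phrase.
    phrase = text.replace("\\", "\\\\").replace('"', '\\"')
    groups = " OR ".join(sorted(user_groups))
    return {
        "q": f'{field}:"{phrase}"',
        "fq": f"access_group:({groups})",  # restrict to what the user may see
        "rows": rows,
        "wt": "json",
    }


params = build_solr_params("Alexandria", "en", {"public", "staff"})
query_string = urlencode(params)  # ready to append to a Solr select URL
```

Putting the access restriction in `fq` rather than in `q` keeps it out of relevance scoring and lets Solr cache the filter separately.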

DAR is open source and has been deployed at the Bibliotheca Alexandrina’s Digital Laboratory since January 2007. DAFv2 manages the entire digitization process, including its various phases, system users, file movement, archiving, and integration with the ILS and the Library's digital repository. This version also allows a workflow to evolve and deviate dynamically in order to handle exceptions, tracks the history of actions, and can manage multiple projects with diverse materials simultaneously. It also supports ingesting a job in the middle of the workflow, and allows easy integration of the tools used to perform the workflow's functions.

Optical Character Recognition

A digital Book Viewer displays books using image-on-text technology. Research was carried out in co-operation with Arabic OCR producers to achieve efficient, high-quality recognition for mass OCR production of Arabic content, reaching an accuracy ranging from 90% to 97%. Although this accuracy is not high enough for users to read the OCR output directly, it is good enough for searching. The BA has therefore concentrated its efforts on publishing books with the text layer behind the image, so that the text can be searched while the image is shown to the user. Full text, content-based search is performed on the whole collection of available books.
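The image-on-text idea can be sketched as follows: each recognized word keeps its position on the page image, a search runs against the hidden text, and the matching positions are used to highlight regions of the image. The `OcrWord` structure and the exact-match comparison are assumptions made for illustration; the text above does not describe the viewer's data format, and the real search is morphological.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrWord:
    """One recognized word and its bounding box on the page image (pixels)."""
    text: str
    x: int
    y: int
    width: int
    height: int


def find_highlights(words, query):
    """Return the bounding boxes of the words that match the query."""
    wanted = query.casefold()
    return [
        (w.x, w.y, w.width, w.height)
        for w in words
        if w.text.casefold() == wanted
    ]


# Hypothetical OCR output for part of one page.
page = [
    OcrWord("Library", 40, 100, 90, 22),
    OcrWord("of", 135, 100, 20, 22),
    OcrWord("Alexandria", 160, 100, 120, 22),
]
boxes = find_highlights(page, "alexandria")
```

The user only ever sees the scanned image, so recognition errors in the hidden text affect which pages are found but never what is displayed.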

The Book Viewer

The book viewer provides several features for the user’s convenience, such as:

  • Full text (morphological) search within the book's title, subject, keywords, and content;
  • Highlighting of search results within the book;
  • Single-page or two-page view;
  • Annotation tools: highlighting, underlining and sticky notes;
  • Streaming: the book is displayed one page at a time, which makes it easier to view over a slow Internet connection;
  • Multilingual interface.

The Digital Lab

DAR is also concerned with the digitization of materials already available in the Library or acquired from other institutions. A digitization laboratory was built for this purpose at the Bibliotheca Alexandrina. The lab is equipped with state-of-the-art technologies for digitizing different types of material, including slides in multiple formats, negatives, books, manuscripts, pictures and maps, audio and video. The complete workflow cycle for producing digital objects has been automated and integrated with the BA Library Information System.