- The first level is the content level, which gives access to the actual content of web archive corpora through various filters.
- The second level is the metadata level, which gives access to two types of metadata. The first type is the temporal web graph: the ArcLink system extracts, preserves, and delivers the temporal web graph for the corpus. ArcLink was published as a poster at JCDL 2013 (with my favorite Minute Madness), and a more detailed version appeared as a tech report. ArcLink was also presented at the IIPC GA 2013 and received good feedback from the web archives consortium. The second type of metadata is thumbnails: we proposed thumbnail summarization techniques to select and generate a distinguished set of pages that represents the main changes in the visual appearance of a webpage through time. This work was presented at ECIR 2014.
- The third level is the URI level, where we tried to extend the default URI lookup interface to benefit from HTTP redirection. This research was discussed at TempWeb 2013, and the full paper is available in the proceedings.
- The fourth level is the archive level, where we quantified current web archiving activities in two directions. The first direction is the percentage of the live web corpus that is covered by web archive materials; this work was presented at JCDL 2011, and a detailed version appeared as a tech report. It attracted the attention of reporters at outlets such as The Atlantic, The Chronicle of Higher Education, and MIT Technology Review. The second direction is the distribution of web archive materials, where we developed new methods to profile web archives based on TLD and language. This work was presented at TPDL 2013, and an extended version with a larger dataset has been accepted for publication in an IJDL special issue.
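
To make the idea of a TLD-based profile concrete, here is a minimal sketch in Python. It is only an illustration, not the method from the TPDL 2013 paper: the function names `tld_of` and `tld_profile` are hypothetical, and the TLD is naively taken as the last label of the hostname rather than being looked up in a public suffix list.

```python
from collections import Counter
from urllib.parse import urlsplit


def tld_of(uri):
    """Return the last hostname label of a URI, or None if there is none.

    Assumption: a naive "last label" definition of TLD, so both
    example.co.uk and example.uk count as "uk".
    """
    host = urlsplit(uri).hostname  # already lowercased by urlsplit
    if not host or "." not in host:
        return None
    label = host.rsplit(".", 1)[-1]
    # Skip IPv4 addresses such as 192.168.0.1, whose last label is numeric.
    if not label or label.isdigit():
        return None
    return label


def tld_profile(uris):
    """Map each TLD to the fraction of the given URIs that fall under it."""
    counts = Counter(t for t in map(tld_of, uris) if t is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tld: n / total for tld, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical sample of archived URIs.
    sample = [
        "http://example.com/a",
        "https://www.example.com/b",
        "http://example.org/",
        "http://bbc.co.uk/news",
    ]
    for tld, share in sorted(tld_profile(sample).items()):
        print(f"{tld}: {share:.2f}")
```

A profile of this kind, computed over a sample of an archive's holdings, gives a rough picture of which TLDs that archive covers; a language profile could be sketched the same way with a language detector in place of `tld_of`.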