Karl Blumenthal

Activity overview

Latest activity by Karl Blumenthal
  • Karl Blumenthal created an article:

    How to download and open ARCH datasets

    Overview: You can download your ARCH datasets to work with them locally from the command line, desktop, or other web-based programs. Follow the instructions below to download each dataset from the I...

  • Karl Blumenthal created an article:

    Tutorial: Explore web archive data from the command line with Jupyter Notebooks

    Introduction: Browser-based tools like those included in the above tutorials can help you to examine and visualize relatively s...

  • Karl Blumenthal created an article:

    Quick guide to using ARCH

    Introduction: ARCH (Archives Research Compute Hub) is a research and education service that helps users build, access, and analyze digital collections computationally at scale. ARCH is configured cu...

  • Karl Blumenthal created an article:

    How to publish ARCH datasets to archive.org

    Overview: You may publish any ARCH dataset as a publicly accessible item on archive.org in order to share and cite it. Follow the instructions below to find the publishing feature, add descriptive m...

  • Karl Blumenthal created an article:

    How to create a custom ARCH collection

    Overview: You may create a custom collection in order to reduce or combine the scopes of the original web archive collections before deriving ARCH datasets. This can be an especially useful pre-proc...

  • Karl Blumenthal created an article:

    Web Archive Named Entities (WANE) files

    Overview: Web Archive Named Entities (WANE) files contain the named entities from each text resource in a web archive collection, organized by originating URL and timestamp. They enable researchers t...

  • Karl Blumenthal created an article:

    Longitudinal Graph Analysis (LGA) files

    Overview: Longitudinal Graph Analysis (LGA) files contain a complete list of which URLs link to which other URLs, along with a timestamp, within an entire web archive collection. They are web graph files th...

  • Karl Blumenthal created an article:

    Web Archive Transformation (WAT) files

    Overview: Web Archive Transformation (WAT) files enable a variety of methods of data analysis for studying web archives in aggregate and across time, including text mining, study of provenance and c...

  • Karl Blumenthal created an article:

    Tutorial: How to mine text from a web archive collection with Voyant

    Introduction: Web archives can be read from the collection-level scale in order to surface the broader themes, topics, people, ...

  • Karl Blumenthal created an article:

    Tutorial: How to browse images from a web archive with Palladio

    Introduction: Web archives contain myriad forms of expression beyond text. You can aggregate their media to enable access more ...