Karl-Blumenthal

  • Total activity 41
  • Following 0 users
  • Followed by 0 users
  • Votes 0
  • Subscriptions 19

Activity overview

Latest activity by Karl-Blumenthal
  • Karl Blumenthal created an article:

    Subscribe to ARCH updates

    Subscribe to updates from the ARCH team by entering your email address below. You will receive email messages about new ARCH releases, support resources, and opportunities to learn more at live eve...

  • Karl Blumenthal created an article:

    How to clean ARCH datasets

    Overview: You can filter or combine the contents to create more specific ARCH datasets. Follow the instructions below to clean sample data with command line tools. You may adapt these instructions a...

  • Karl Blumenthal created an article:

    How to download and open ARCH datasets

    Overview: You can download your ARCH datasets to work with them locally from the command line, desktop, or other web-based programs. Follow the instructions below to download each dataset from the I...

  • Karl Blumenthal created an article:

    Tutorial: Explore web archive data from the command line with Jupyter Notebooks

    Introduction: Browser-based tools like those included in the above tutorials can help you to examine and visualize relatively s...

  • Karl Blumenthal created an article:

    Quick guide to using ARCH

    Introduction: ARCH (Archives Research Compute Hub) is a research and education service that helps users build, access, and analyze digital collections computationally at scale. ARCH is configured cu...

  • Karl Blumenthal created an article:

    How to publish ARCH datasets to archive.org

    Overview: You may publish any ARCH dataset as a publicly accessible item on archive.org in order to share and cite it. Follow the instructions below to find the publishing feature, add descriptive m...

  • Karl Blumenthal created an article:

    How to create a custom ARCH collection

    Overview: You may create a custom collection in order to reduce or combine the scopes of the original web archive collections before deriving ARCH datasets. This can be an especially useful pre-proc...

  • Karl Blumenthal created an article:

    ARCH named entities datasets

    Overview: ARCH named entities datasets contain the people, places, organizations, and dates from the text of a web archive collection, organized by originating URL and timestamp. They enable researc...

  • Karl Blumenthal created an article:

    Longitudinal Graph Analysis (LGA) files

    Overview: Longitudinal Graph Analysis (LGA) files contain a complete list of what URLs link to what URLs, along with a timestamp, within an entire web archive collection. They are web graph files th...

  • Karl Blumenthal created an article:

    Web Archive Transformation (WAT) files

    Overview: Web Archive Transformation (WAT) files enable a variety of methods of data analysis for studying web archives in aggregate and across time, including text mining, study of provenance and c...