Karl

Latest activity by Karl
  • Karl created an article,

    How to clean ARCH datasets

    Overview: You can filter or combine the contents to create more specific ARCH datasets. Follow the instructions below to clean sample data with command line tools. You may adapt these instructions a...

  • Karl created an article,

    How to download and open ARCH datasets

    Overview: You can download your ARCH datasets to work with them locally from the command line, desktop, or other web-based programs. Follow the instructions below to download each dataset from the I...

  • Karl created an article,

    Tutorial: Explore web archive data from the command line with Jupyter Notebooks

    Introduction: Browser-based tools like those included in the above tutorials can help you to examine and visualize relatively s...

  • Karl created an article,

    Quick guide to using ARCH

    Introduction: ARCH (Archives Research Compute Hub) is a research and education service that helps users build, access, and analyze digital collections computationally at scale. ARCH is configured cu...

  • Karl created an article,

    How to publish ARCH datasets to archive.org

    Overview: You may publish any ARCH dataset as a publicly accessible item on archive.org in order to share and cite it. Follow the instructions below to find the publishing feature, add descriptive m...

  • Karl created an article,

    How to create a custom ARCH collection

    Overview: You may create a custom collection in order to reduce or combine the scopes of the original web archive collections before deriving ARCH datasets. This can be an especially useful pre-proc...

  • Karl created an article,

    ARCH named entities datasets

    Overview: ARCH named entities datasets contain the people, places, organizations, and dates from the text of a web archive collection, organized by originating URL and timestamp. They enable researc...

  • Karl created an article,

    Longitudinal Graph Analysis (LGA) files

    Overview: Longitudinal Graph Analysis (LGA) files contain a complete list of which URLs link to which URLs, along with a timestamp, within an entire web archive collection. They are web graph files th...

  • Karl created an article,

    Web Archive Transformation (WAT) files

    Overview: Web Archive Transformation (WAT) files enable a variety of methods of data analysis for studying web archives in aggregate and across time, including text mining, study of provenance and c...

  • Karl created an article,

    Tutorial: How to mine text from a web archive collection with Voyant

    Introduction: Web archives can be read from the collection-level scale in order to surface the broader themes, topics, people, ...