Changing a CMOS Battery

Computers, computers, computers. I knew becoming a “technician” meant that I would learn more about computers and how they work, but I admit when I took an archival post I thought it might involve more film and analog video tech. Not that I don’t work with those things – in a couple weeks I will be demonstrating/troubleshooting film digitization workflows for telecine and scanners for our first-year basic media training/handling course. But I suppose the things that I have really expanded my knowledge of in the past year, and the things I am therefore excited and inclined to pass along to you, dear readers, have to do with the inner workings of computers and electronics. I have relied on them for so long, and really had little idea of all the intricate components that make them tick – and I’d like to quickly talk about another one of those pieces today.

zy5uyzcnkbvm2mz6heqi
No not that one

A few months back while dual-booting a Windows machine I wrote about the BIOS – the “Basic Input/Output System” that lives on the motherboard of every computer. As I explained then, this is a piece of firmware (“firmware” is a low-level application or piece of code that is baked into a piece of equipment/hardware and cannot be easily installed/uninstalled – hence “firm” instead of “soft”) that maintains some super-basic functionality for your computer even if the operating system were to corrupt or fail.

Recently, the computer on our digital audio workstation started automatically booting into the BIOS every time it was turned on. You could easily work around this – just choosing to immediately exit the BIOS would immediately boot Windows 7 and all seemed to be functioning normally. But the system clock would reset every time (back to the much more innocent time of January, 2009). I wasn’t overly concerned, as any kind of actual failure with the operating system would’ve produced some kind of error message in the BIOS, and there was no such warning. But what was going on?

img_2691
Explain yourself!!!!

After a little digging on the internet, the giveaway turned out to be the system clock reset. The BIOS firmware itself is stored on the motherboard in a small chip of ROM (Read-Only Memory), but the settings you configure in it live in a separate, very tiny piece of memory usually referred to as “the CMOS.” It is very small because there are only a few pieces of data that need to actually be saved there every time you turn the computer on and off – just system settings like the clock, which take up a minuscule amount of space compared to video, audio, or even large text or application files. The trick is, unlike ROM, this CMOS memory needs a constant source of power to actually save anything – even these tiny pieces of data. Cut off its power supply and it will completely reset/empty itself (much the same way your RAM – Random Access Memory – functions, but that’s another topic). This is what was happening – the BIOS was popping up every time we booted the computer because its settings had been reset and it just wanted us to set up our clock, fan speed, and other default system settings again (and again, and again, and again).


img_2693
Oh how I wish it was 2009 again, computer.

How was the power being cut off? Under most circumstances, even leaving a computer plugged into a wall outlet is enough to maintain a flow of enough electricity to the CMOS (complementary metal-oxide semiconductor) chip that holds the BIOS settings. Yes, as long as your device is still plugged into a wall outlet, there is electricity running to it, even if you have turned the device off. Which is why we keep the computer (and various other devices) on our digital audio workstation on a power strip that we periodically completely switch off – to conserve power/electricity.

img_2690
Yep we’re just the environment’s super best friend

But there is usually a failsafe to save the BIOS settings even under such circumstances: every modern motherboard has a small coin battery to power the CMOS and thereby save your BIOS settings, even if the power supply unit on the back of your computer is completely disconnected. What happens when you completely unplug the computer AND that battery dies? Your BIOS will reset every time you boot up the computer – exactly what happened to us.

So, at the end of all that, easy fix: buy a new coin battery at the hardware store and replace the dead battery in the computer. Most modern CMOS batteries are type CR2032, but unfortunately manufacturers tend not to just list that information in their specs, so there’s no way to know for sure until you open up the computer itself and take a look. Hopefully the battery is in an easy-to-find spot that doesn’t require you to actually remove the CPU or any expansion cards/RAM/hard drives/whatever else you’ve got going on inside your computer.

img_2695
Zoom….
img_2696
….and enhance!! Found you!

Here are some basic instructions for replacing a CMOS battery if this happens to you – but it really is as easy as it sounds. The main thing to keep in mind, when working in the inside of any computer, is to take steps to ground yourself and avoid the buildup of static electricity – electro-static discharge can seriously damage the components of your computer, without you even noticing it (the threshold for static electricity to damage circuits is much lower than it is for you to feel those shocks). There are some easy tips here for grounding yourself – I just made sure I was wearing cotton clothes, was working on an anti-static (foam) surface, and occasionally touched a nearby metal surface as I worked.

I’ll be back soon with (I hope) a much longer piece running down my experience at the Association of Moving Image Archivists’ conference in Pittsburgh this month and how I found it very powerfully related to my work this year. Cheers!

Windows Subsystem for Linux – What’s the Deal?

This past summer, Microsoft released its “Anniversary Update” for Windows 10. It included a lot of the business-as-usual sort of operating system updates: enhanced security, improved integration with mobile devices, updates to Microsoft’s “virtual assistant” Cortana (who is totally not named after a video game AI character who went rampant and is currently trying to destroy all biological life in the known universe, because what company would possibly tempt fate like that?)

halo-4-cortana-rampant
“NO I WILL NOT OPEN ANOTHER INCOGNITO WINDOW FOR YOU, FILTH”

But possibly the biggest under-the-radar change to Windows 10 was the introduction of Bash on Ubuntu on Windows. Microsoft partnered with Canonical, the company that develops the popular Linux operating system distribution Ubuntu, to create a full-fledged Linux/Ubuntu subsystem (essentially Ubuntu 14.04 LTS) inside of Windows 10. That’s like a turducken of operating systems.

foo_ck_turducken_1223
Which layer is the NT kernel, though?

What does that mean, practically speaking? For years, if you were interested in command-line control of your Windows computer, you could use PowerShell or the Command Prompt – the latter being the same basic command-line system that Microsoft has been using since the pre-Windows days of MS-DOS. Contrast that with Unix-based systems like Mac OSX and Ubuntu, which by default use an input system called the Bash shell – the thing you see any time you open the application Terminal.

 

The Bash shell is very popular with developers and programmers. Why? A variety of reasons. It’s an open-source system versus Microsoft’s proprietary interface, for one. It has some enhanced security features to keep users from completely breaking their operating system with an errant command (if you’re a novice command-line user, that’s why you use the “sudo” command sometimes in Terminal but never in Command Prompt – Windows just assumes everyone using Command Prompt is a “super user” with access to root directories, whereas Mac OSX/Linux prefers to at least check that you still remember your administrative password before deleting your hard drive from practical existence). The Bash scripting language handles batch processing (working with a whole bunch of files at once), scheduling commands to be executed at future times, and other automated tasks a little more intuitively. And, finally, Unix systems have a lot more built-in utility tools that make software development and navigating file systems more elegant (to be clear, these utility applications are not technically part of the Bash shell – they are built into the Mac OSX/Linux operating system itself and accessed via the Bash shell).
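
To make the batch processing point a bit more concrete, here is a minimal sketch of a Bash loop that walks through every file in a folder and reports its size. The function name `report_sizes` is made up for this example, and nothing in it is specific to any tool discussed in this post.

```shell
# Hypothetical helper: print the size in bytes of every regular file in a directory.
report_sizes() {
    local dir="$1"
    local f size
    for f in "$dir"/*; do
        # Skip sub-directories (and the unexpanded pattern if the folder is empty)
        [ -f "$f" ] || continue
        # wc -c counts bytes; the arithmetic expansion strips any padding spaces
        size=$(( $(wc -c < "$f") ))
        printf '%s\t%s bytes\n' "$(basename "$f")" "$size"
    done
}

# Example: report on the current directory
report_sizes .
```

Swap the `printf` line for any other command and the same loop will run that command over the whole folder, which is really all “batch processing” means here.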

Bringing in a Linux subsystem and Bash shell to Windows is a pretty bold move to try and win back developers to Microsoft’s platform. There have been some attempts before to build Linux-like environments for Windows to port Mac/Linux software – Cygwin was probably the most notable – but no method I ever tried, at least, felt as intuitive to a Mac/Linux user as Bash on Ubuntu on Windows does.

cygwinsetup
what even are you

Considering the increasing attention on open-source software development and command-line implementation in the archival community, I was very curious as to whether Bash on Ubuntu on Windows could start bridging the divide between Mac and Windows systems in archives and libraries. The problem of incompatible software and the difference in command-line language between Terminal and Command Prompt isn’t insurmountable, but it’s not exactly convenient. What if we could get all users on the same page with the software they use AND how they use them – regardless of operating system???

OK. That’s still a pipe dream. I said earlier that the Windows Subsystem for Linux (yes that’s what it’s technically called even though that sounds like the exact opposite of what it should be) was “full-fledged” – buuuuuut I kinda lied. Microsoft intends the WSL to be a platform for software development, not implementation. You’re supposed to use it to build your applications, but not necessarily actually deploy them into a Windows-based workflow. To that end, there are some giant glaring holes compared to a pure Ubuntu installation: using Bash on Ubuntu on Windows, you can’t deploy any Linux software with a graphical user interface (GUI) (for example, the common built-in Linux text editing utility program gedit doesn’t work – but nano, which allows you to text edit from within the Bash terminal window itself, does). It’s CLI or bust. Any web-based application is also a big no-no, so you’re not going to be able to sneakily run a Windows machine as a server using the Linux subsystem any time soon.

Edit: Oh and the other giant glaring thing I forgot to mention the first time around – there’s no external drive support yet. So the WSL can’t access removable media on USB or optical disc mounted on the Windows file system – only fixed drives. So disc imaging software, while it technically “works”, can only work with data already moved to your Windows system.

But with all those caveats in mind… who cares what Microsoft says is supposed to happen? What does it actually do? What works, and what doesn’t? I went through a laundry list of command-line tools that have been used or taught the past few years in our MIAP courses (primarily Video Preservation, Digital Preservation and Handling Complex Media), plus a few tools that I’ve personally found useful. First, I wanted to see if they installed at all – and if they did, I would try a couple of that program’s most basic commands, hardly anything in the way of flags or options. I wasn’t really trying to stress-test these applications, just see if they could indeed install and access the Windows file system in a manner familiar to Mac/Linux users.

bash
*Hello Bash my only friend / I’ve come to ‘cat’ with you again*

Before I start the run-down, a note on using Bash on Ubuntu on Windows yourself, if interested. Here are the instructions for installing and launching the Windows Subsystem for Linux – since the whole thing is technically still in beta, you’ll need to activate “developer” mode. Once installed and launched, ALL of these applications will only work through the Bash terminal window – you can not access the Linux subsystem, and all software installed thereon, from the traditional Windows Command Prompt. (It goes the other way too – you can’t activate your Windows applications from the Bash shell. This is all about accessing and working with the same files from your preferred command-line environment.) And once again, the actual Ubuntu version in this subsystem is 14.04 LTS – which is not the latest stable version of that operating system. So any software designed only to work with Ubuntu 16.04 or the very latest 16.10 isn’t going to work in the Windows subsystem.

Once you’re in a Bash terminal, you can access all your files stored within the Windows file system by navigating into the “/mnt/” directory:

$ cd /mnt/

You should see different letters within this directory according to how many drives you have mounted in your computer, and their assigned letters/paths. For instance, for many Windows users all your files will probably be contained within something like:

/mnt/c/Users/your_user_name/Downloads or

/mnt/c/Users/your_user_name/Desktop, etc. etc.

And one last caveat: dragging and dropping a file into the Bash terminal to quickly get the full file path doesn’t work. It will give you a Command Prompt file path (e.g. “C:\Users\username\Downloads\file.pdf”) that the Bash shell can’t read. You’re going to have to manually type out the full file path yourself (tabbing over to automatically fill in directory/file names does still work, at least).
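
If typing paths out by hand gets old, the conversion is mechanical enough to script. This is only a sketch: `win_to_wsl_path` is a hypothetical helper name, and it assumes the simple case of a drive letter followed by a backslash-separated path, mounted under /mnt/ as described above.

```shell
# Hypothetical helper: turn a Windows-style path into its /mnt/<drive>/... equivalent.
win_to_wsl_path() {
    local drive rest
    # First character is the drive letter; lowercase it (C -> c)
    drive=$(printf '%s' "$1" | cut -c1 | tr '[:upper:]' '[:lower:]')
    # Everything after "C:" is the path; swap backslashes for forward slashes
    rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
    printf '/mnt/%s%s\n' "$drive" "$rest"
}

# Single quotes keep Bash from eating the backslashes
win_to_wsl_path 'C:\Users\username\Downloads\file.pdf'
```

The single quotes around the dragged-in path matter: without them, Bash treats each backslash as an escape character before the function ever sees it.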

Let’s get to it!

Programs That Install Via Apt-Get:

  • bagit-java
  • bagit-python
  • cdrdao
  • ClamAV
  • ddrescue (install w/package name “gddrescue”, execute w/ command “ddrescue”)
  • ffmpeg (but NOT ffplay)
  • git
  • imagemagick
  • md5deep
  • mediainfo
  • MKVToolNix
  • Python/Python3/pip
  • Ruby/RubyGems
  • rsync
  • tree

Installing via Ubuntu’s “apt-get” utility is by far the easiest and most desirable method of getting applications installed on your Linux subsystem. It’s a package manager that works the same way as Homebrew on Mac, for those used to that system: just execute

$ sudo apt-get install nameofpackage

and apt-get will install the desired program, including all necessary dependencies (any software libraries or other software utilities necessary to make the program run). As you can see, the WSL can handle a variety of useful applications: disk imaging (cdrdao, ddrescue), transcoders (ffmpeg, imagemagick), virus scanning (ClamAV), file system and metadata utilities (mediainfo, tree), and hash/checksum generation (md5deep).
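
A quick way to confirm that an install actually put a command on your PATH is to ask the shell directly. The sketch below uses a made-up function name, `check_tools`, and the three tool names at the bottom are just examples picked from the list above.

```shell
# Hypothetical helper: report whether each named command is available on the PATH.
check_tools() {
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s: found\n' "$tool"
        else
            printf '%s: missing\n' "$tool"
        fi
    done
}

# Several packages can be installed in one go, e.g.:
#   sudo apt-get update
#   sudo apt-get install mediainfo tree md5deep
# ...and then checked afterwards:
check_tools mediainfo tree md5deep
```

Keep in mind that the package name and the command name are not always the same thing (see ddrescue in the list above), so a “missing” result is worth double-checking against the package’s documentation.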

mediainfo
Windows 10!

You can also get distributions of programming languages like Python and Ruby and use their own package managers (pip, RubyGems) to install further packages/libraries/programs. I tried this out with Python by installing bagit-python (my preferred flavor of BagIt – see this previous post for difference between bagit-python and the bagit-java program you get by just running “apt-get bagit”), and with Ruby by installing the Nokogiri library and running through this little Ruby exercise by Ashley Blewer. (I’d tried it before on Mac OSX but guess what, works on Windows through the WSL too!)

A couple things to note: one, if you’re trying to install the Ubuntu version of ddrescue, there are, confusingly, a couple of different packages with the same name that serve the same purpose. There’s a nice little rundown of how that happened and how to make sure you’re installing and executing exactly the program you want on the Ubuntu forums.

Also, while ffmpeg’s transcoding and metadata-gathering features (ffprobe) work fine, its built-in media playback command (ffplay) will not, because of the aforementioned issue with GUIs (it has to do with X11, the window system that Unix systems use for graphical display, but never mind that for now). Actually, it sort of depends on how you define “work”, because while ffplay won’t properly play back video, it will generate some fucking awesome text art for you in the Bash terminal:

 

Programs That Require More Complicated Installation:

  • bulk_extractor (requires legacy JDK)
  • exiftool
  • Fslint
  • mediaconch
  • The Sleuth Kit tools

These applications can’t be installed via an apt-get package, but you can still get them running with a little extra work, thanks to other Linux features such as dpkg. Dpkg is another package management program – this one comes from Debian, a Linux operating system of which Ubuntu is a direct (more user-friendly) derivative. You can use dpkg to install Debian (.deb) packages (like the CLI of Mediaconch), although take note that unlike apt-get, dpkg does not automatically install dependencies – so you might need to go out and find other libraries/packages to install via apt-get or dpkg before your desired program actually starts working (for Mediaconch, for instance, you should just apt-get install Mediainfo first to make sure you have the libmediainfo library already in place).
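
One common pattern for the missing-dependencies problem is to let dpkg do its install and then ask apt-get to clean up after it. The sketch below wraps that in a function called `install_deb`, which is a name made up for this example, and the .deb file name in the usage comment is a placeholder rather than a real download.

```shell
# Hypothetical helper: install a local .deb file, then let apt-get fetch
# whatever dependencies dpkg could not find.
install_deb() {
    local deb_file="$1"
    # Show the dependencies the package declares, for reference
    dpkg-deb -f "$deb_file" Depends
    # dpkg installs the package itself but does not fetch dependencies
    sudo dpkg -i "$deb_file"
    # -f ("fix broken") asks apt-get to install anything still missing
    sudo apt-get install -f
}

# Usage (the file name is a placeholder):
#   install_deb ./some_package.deb
```

This will not help when a dependency simply isn’t in Ubuntu’s repositories, in which case you’re back to tracking down the library yourself.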

The WSL does also have the autoconf and automake utilities of full Linux distributions, so you can also use those to get packages like The Sleuth Kit (a bunch of digital forensics tools) or Fslint (a duplicate file finder) running. Best solution is to follow whatever Linux installation documentation there is for each of these programs – if you have questions about troubleshooting specific programs, let me know and I’ll try to walk you through my process.


Programs That Don’t Work/Install…Yet:

  • Archivematica
  • Guymager
  • vrecord

I had no expectation that these programs would work given the stated GUI and web-based limitations of the WSL, but this is just to confirm that as far as I can tell, there’s no way to get them running. Guymager has the obvious GUI/X11 issue (plus the inability to recognize external devices, anyway, and the general dysfunction of the /dev/ directory). The vrecord team hasn’t successfully installed on Linux yet, and the WSL would run into the GUI issue even if they do release a Linux version. And web applications definitely aren’t my strong suit, but in the long process of attempting an Archivematica installation, the WSL seemed to have separate issues with Apache, uWSGI and NGINX. That’s a lot of troubleshooting to likely no end, so best to probably leave that one aside.

That’s about all for now – I’m curious if anyone else has been testing the WSL, or has any thoughts about its possible usefulness in bridging compatibility concerns. Is there any reason we shouldn’t just be teaching everyone Bash commands now??

Update (10/20): So the very day that I posted this, Microsoft released a pretty major update to the WSL, with two major effects: 1) new installations of WSL will now be Ubuntu 16.04 (Xenial), though existing users such as myself will not automatically upgrade from 14.04; and 2) the Windows and Linux command-line interfaces now have cross-compatibility, so you can launch Windows applications from the Bash terminal and Linux applications from Command Prompt or PowerShell. Combine that with the comment below from Euan with directions to actually launch Linux applications with GUIs, and there’s a whole slew of options to continue exploring here. Look for further posts in the future! This subsystem is clearly way more powerful than Microsoft initially wanted to let on.

Using Bagit

do you know how to use Bagit?
cause I’m lost
the libcongress github is sooo not user friendly

I’m missing bagger aren’t I
ugh ugh I just want to make a bag!

That’s an email I got a few weeks back from a good friend and former MIAP classmate. I wanted to share it because I feel like it sums up the attitude of a lot of archivists towards a tool that SHOULD be something of a godsend to the field and a foundational step in many digital preservation workflows – namely, the Library of Congress’ BagIt.

What is BagIt? It’s a software library, developed by the LoC in conjunction with their partners in the National Digital Information Infrastructure and Preservation Program (NDIIPP) to support the creation of “bags.” OK, so what’s a bag?


Let’s back up a minute. One of the big challenges in digital archiving is file fixity – a fancy term for checking that the contents of a file have not been changed or altered (that the file has remained “fixed”). There’s all sorts of reasons to regularly verify file fixity, even if a file has done nothing but sit on a computer or server or external hard drive: to make sure that a file hasn’t corrupted over time, that its metadata (file name, technical specs, etc.) hasn’t been accidentally changed by software or an operating system, etc.
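
At its simplest, a fixity check is just “record a checksum now, recompute it later, compare.” Here is a bare-bones sketch of that idea. The function names `file_md5` and `fixity_ok` are made up for this example, the file name in the usage comments is a placeholder, and it assumes the GNU `md5sum` command is available (it is on Ubuntu; Macs ship a differently-named `md5` instead).

```shell
# Hypothetical helpers for a bare-bones fixity check. Assumes GNU md5sum.

# Print just the MD5 checksum of a file
file_md5() {
    md5sum "$1" | cut -d ' ' -f 1
}

# Succeed only if the file's current checksum matches a previously recorded one
fixity_ok() {
    [ "$(file_md5 "$1")" = "$2" ]
}

# Usage (the file name is a placeholder):
#   recorded=$(file_md5 interview.wav)                        # on day one
#   fixity_ok interview.wav "$recorded" && echo "unchanged"   # some time later
```

Note that a checksum like this only covers the contents of the file, so a renamed file with identical contents would still pass.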

But one of the biggest threats to file fixity is when you move a file – from a computer to a hard drive, or over a server, or even just from one folder/directory in your computer to another. Think of it kind of like putting something in the mail: there are a lot of points in the mailing process where a computer or USPS employee has to read the labeling and sort your mail into the proper bin or truck or plane so that it ends up getting to the correct destination. And there’s a LOT of opportunity for external forces to batter and jostle and otherwise get in your mail’s personal space. If you just slap a stamp on that beautiful glass vase you bought for your mother’s birthday and shove it in the mailbox, it’s not going to get to your mom in one piece.

screen-shot-2016-09-20-at-10-04-35-am
And what if you’re delivering something even more precious than a vase?

So a “bag” is a kind of special digital container – a way of packaging files together to make sure what we get on the receiving end of a transfer is the same thing that started the journey (like putting that nice glass vase in a heavily padded box with “fragile” stamped all over it).

Great, right? Generating a bag can take more time than people want, particularly if you’re an archive dealing with large, preservation-quality uncompressed video files, but it’s a no-brainer idea to implement into workflows for backing up/storing data. The thing is, as I said, the BagIt tools developed by the Library of Congress to support the creation of bags are software libraries – not necessarily in and of themselves fully-developed, ready-to-go applications. Developers have to put some kind of interface on top of the BagIt library for people to actually be able to interact with it and use it to create bags.

So right off the bat, even though tech-savvier archivists may constantly be recommending to others in the field to “use BagIt” to deliver or move their files, we’re already muddling the issue for new users, because there literally is no one, monolithic thing called “BagIt” that someone can just google and download and start running. And I think we seriously underestimate how much of a hindrance that is to widespread implementation. Basically anyone can understand the principles and need behind BagIt (I hopefully did a swift job of it in the paragraphs above) – but actually sifting through and installing the various BagIt distributions currently takes time, and an ability to read between the lines of some seriously scattered documentation.

So here I’m going to walk through the major BagIt implementations and explain a bit about how and why you might use each one. I hope consolidating all this information in one place will be more helpful than the Library of Congress’ github pages (which indeed make little effort to make their instructions accessible to anyone unfamiliar with developer-speak). If you want to learn more about the BagIt specification itself (i.e. what pieces/files actually make up a bag, how data gets hierarchically structured inside a bag, how BagIt uses checksums and tag manifests to do the file fixity work I mentioned earlier), I can recommend this introductory slideshow to BagIt from Justin Littman at the LoC.

Update (12/15/2017): While all the above info still stands, the roundup and installation instructions below are no longer 100% accurate. I’m keeping this post up for the sake of web archiving and laying the ever-changing state of digital preservation bare and all that, but if you’re here I’d now recommend that you proceed over to this post on using BagIt in 2018 for more up-to-date documentation!

1. Bagger (BagIt-java)


The BagIt library was originally developed using a programming language called Java. For its first four stable versions, Bagit-java could be used either via command-line interface (a Terminal window on Macs or Linux/Ubuntu, Command Prompt in Windows), or via a Graphical User Interface (GUI) developed by the LoC itself called Bagger.

As of version 5 of Bagit-java, the LoC has completely ceased support of using BagIt-java via the command line. That doesn’t mean it isn’t still out there – if, for instance, you’re on a Mac and use the popular package manager Homebrew, typing

$ brew install bagit

will install the last, stable version (4.12.1) of BagIt-java. But damned if I know how to actually use it, because in its deprecation of support the LoC seems to have removed (or maybe just never wrote?) any online documentation (github or elsewhere) of how to use BagIt-java via command-line. No manual page for you.

Instead you now have to use Bagger to employ BagIt-java (from the LoC’s perspective, anyway). Bagger is primarily designed to run on Windows or, with some tinkering, Linux/Ubuntu and Mac OSX.

maxresdefault
They do, funnily enough, include a screed about the iPhone 7’s lack of a 3.5mm headphone jack.

So once you actually download Bagger, which I primarily recommend if you’re on Windows, there’s some pretty good existing documentation for using the application’s various features, and even without doing that reading, it’s a pretty intuitive program. Big honking buttons help you either start making a new bag (picking the A/V and other files you want to be included in the bag one-by-one and safely packaging them together into one directory) or create a “bag in place”, which just takes an already-existing folder/directory and structures the content within that folder according to the BagIt specification. You can also validate bags that have been given/sent to you (that is, check the fixity of the data). The “Is Bag Complete” feature checks whether a folder/directory you have is, in fact, officially a bag according to the rules of the BagIt spec.
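
For a rough idea of what “validating” involves under the hood, here is a sketch of a hand-rolled check on a bag that uses MD5 checksums. This is not how Bagger itself is implemented; `check_bag` is a made-up function name, and the sketch assumes GNU `md5sum` plus a manifest written in md5sum’s own “checksum, two spaces, path” layout.

```shell
# Hypothetical helper: a rough validity check for a bag that uses MD5 manifests.
# Assumes GNU md5sum and a manifest in md5sum's own "checksum  path" layout.
check_bag() {
    local bag_dir="$1"
    # A bag needs its declaration file, a payload folder, and a manifest
    [ -f "$bag_dir/bagit.txt" ] || { echo "missing bagit.txt"; return 1; }
    [ -d "$bag_dir/data" ] || { echo "missing data directory"; return 1; }
    [ -f "$bag_dir/manifest-md5.txt" ] || { echo "missing manifest-md5.txt"; return 1; }
    # Paths in the manifest are relative to the top of the bag, so check from there
    if (cd "$bag_dir" && md5sum -c manifest-md5.txt >/dev/null 2>&1); then
        echo "bag is valid"
    else
        echo "fixity check failed"
        return 1
    fi
}

# Usage (the path is a placeholder):
#   check_bag /path/to/bag
```

A real validator does more than this: for one thing, this sketch will not notice extra files sitting in the data folder that the manifest never mentions.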

(FWIW: I managed to get Bagger running on my OSX 10.10.5 desktop by installing an older version, 2.1.3, off of the Library of Congress’ Sourceforge. That download included a bagger.jar file that my Mac could easily open after installing the legacy Java 6 runtime environment (available here). But, that same Sourceforge page yells at you that the project has moved to Github, where you can only download the latest 2.7.1 release, which only includes the Windows-compatible bagger.bat file and material for compiling on Linux, no OSX-compatible .jar file. I have no idea what’s going on here, and we’ve definitely fallen into a tech-jargon zone that will scare off laypeople, so I’m going to leave it at “use Bagger with Windows”)

Update: After some initial confusion (see above paragraph), the documentation for running Bagger on OSX has improved twofold! First, the latest release of Bagger (2.7.2) came with some tweaks to the github repo’s documentation, including instructions for Linux/Ubuntu/OSX! Thanks, guys! Also check out the comments section on this page for some instructions for launching Bagger in OSX from the command-line and/or creating an AppleScript that will let you launch Bagger from Spotlight/the Applications menu like you would any other program.

2. Exactly

screen-shot-2016-09-20-at-9-29-04-am
Ed Begley knows exactly what I’m talking about

Developed by the consulting/developer agency AVPreserve with University of Kentucky Libraries, Exactly is another GUI built on top of Bagit-java. Unlike Bagger, it’s very easy to download, immediately install and run versions of Exactly for Mac or Windows, and AVPreserve provides a handy quickstart guide and a more detailed user manual, both very useful if you’re just getting started with bagging. Its features are at once more limited and more expansive than Bagger. The interface isn’t terribly verbose, meaning it’s not always clear what the application is actually doing from moment to moment. But Exactly is more robustly designed for insertion of extra metadata into the bag (users can create their own fields and values to be inserted in a bag’s bag-info.txt file, so you could include administrative metadata unique to your own institution).
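
The bag-info.txt file mentioned above is nothing exotic: it is a plain text file of “Label: value” lines. The sketch below shows the format by appending lines by hand. It says nothing about how Exactly does this internally; `add_bag_info` is a made-up function name, and “Accession-Number” and both values are invented for illustration (“Source-Organization” is one of the labels the BagIt spec itself suggests).

```shell
# Hypothetical helper: append one "Label: value" line to a bag's bag-info.txt.
add_bag_info() {
    local bag_dir="$1"
    local label="$2"
    local value="$3"
    printf '%s: %s\n' "$label" "$value" >> "$bag_dir/bag-info.txt"
}

# Usage (bag path and values are placeholders):
#   add_bag_info /path/to/bag Source-Organization "Example University Archives"
#   add_bag_info /path/to/bag Accession-Number "2016-042"
```

One caution: if the bag already has a tag manifest (a checksum list covering files like bag-info.txt), editing bag-info.txt by hand afterwards means that tag manifest no longer matches, so metadata is best added at bagging time by whatever tool you use.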

And Exactly’s biggest attraction is that it actually serves as a file delivery client – that is, it won’t just package the files you want into a bag before transferring, but actually perform the transfer. So if you want to regularly move files to and from a dedicated server for storage with minimal fuss, Exactly might be the tool you want, though it could still use some aesthetic/verbosity design upgrades.

3. Command Line (BagIt-python)


Let’s say you really prefer to work with command-line software. If you’re comfortable with CLI, there are many advantages – greater control over your bagging software and exactly how it operates. I mentioned earlier that the LoC stopped supporting BagIt-java for command-line, but that doesn’t mean you command-line junkies are completely out of luck. Instead they shifted support and development of command-line bagging to a different programming language: Python.

If you’re working in the command-line, chances are you’re on Mac OSX or maybe Linux. I’m going to assume from here on you’re on OSX, because anyone using Linux has likely figured all this out themselves (or would prefer to). And here’s the thing, if you’re a novice digital archivist working in the Terminal on OSX: you can’t install BagIt-python using Homebrew.

Instead, you’re going to need Python’s own package manager, a program called “pip.” In order to get Bagit-python, you’re going to need to do the following:

  1. Check what version of Python is on your computer. Mac OSX and Linux machines should come with Python already installed, but Bagit-python will require at least Python version 2.6 or above. You can check what version of Python you’re running in the terminal with:

    $ python --version

    If your version isn’t high enough, visit https://www.python.org/downloads/ and download/install Python 2.7.12 for your operating system. (Do not download a version of Python 3.x – Bagit-python will not work with Python 3, as if this wasn’t confusing enough.)

  2. Now you’ll need the package manager/installer, pip. It may have come ready to go with your Python installation, or not. You can check that you have pip installed with:

    $ pip --version

    If you get a version number, you’re ready to go to step 3. If you get a message that pip isn’t installed, you’ll have to visit https://pip.pypa.io/en/stable/installing/ and click on the hyperlinked “get-pip.py”. A tab should open with a buncha text – just hit Command-S and you should have the option to save/download this page as a Python script (that is, as a .py file). Then, back in the Terminal, navigate into whatever directory you just downloaded that script into (Downloads perhaps, or Desktop, or wherever else), and run the script by invoking:

    $ python get-pip.py

    Pip should now be installed.

  3. Once pip is in place you can just use it to install Bagit-python the same way you use Homebrew:

    $ pip install bagit               (sudo if necessary)
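
The version check in step 1 can also be scripted if you would rather not eyeball it. This is a minimal sketch based only on the requirement stated above (Python 2.6 or later, but not Python 3); `python_version_ok` is a made-up function name.

```shell
# Hypothetical helper: succeed if a version string like "2.7.12" is
# Python 2.6 or later, but not Python 3.
python_version_ok() {
    local major rest minor
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text before the next dot
    [ "$major" -eq 2 ] && [ "$minor" -ge 6 ]
}

# Python 2 prints its version to stderr, hence the 2>&1
if command -v python >/dev/null 2>&1; then
    version=$(python --version 2>&1 | cut -d ' ' -f 2)
    if python_version_ok "$version"; then
        echo "Python $version should work with Bagit-python"
    else
        echo "Python $version is outside the 2.6 to 2.x range"
    fi
else
    echo "no 'python' command found"
fi
```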

You should be all set now to start using Bagit-python via the command-line. You can invoke Bagit-python using commands that start with “bagit.py” – it’s a good idea to go over command line usage and options for adding metadata by visiting the help page first, which is one of the LoC’s better efforts at documentation: https://github.com/LibraryOfCongress/bagit-python or:

$ bagit.py --help

But the easiest usage is just to create a bag “in place” on a directory you already have, with a flag for whichever kind of checksum you want to use for fixity:

$ bagit.py --md5 /path/to/directory

As with Bagger, the directory will remain exactly where it is in your file system, but now will contain the various tag/checksum manifests and all the media files within a “data” folder according to the BagIt spec. Again, the power of command-line BagIt-python lies in its flexibility – the ability to add metadata, choose different checksum configurations, increase or decrease the verbosity of the software. If you’re comfortable with command-line tools, this is the implementation I would most recommend!
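
To demystify what ends up inside that directory, here is a sketch that builds a stripped-down bag by hand. It is not what Bagit-python actually runs, and it skips things a real tool writes (bag-info.txt, a tag manifest). The function name `make_simple_bag` is made up, GNU `md5sum` is assumed, and the spec version written into bagit.txt (0.97) is my assumption about the version current at the time of writing.

```shell
# Hypothetical helper: turn a folder into a bare-bones bag "in place".
# Assumes GNU md5sum. Hidden files (names starting with a dot) are not moved.
make_simple_bag() {
    local bag_dir="$1"
    (
        cd "$bag_dir" || exit 1
        # Move everything currently in the folder into a "data" payload folder
        mkdir data.tmp || exit 1
        for item in *; do
            [ -e "$item" ] || continue
            [ "$item" = "data.tmp" ] || mv "$item" data.tmp/
        done
        mv data.tmp data
        # The bag declaration: spec version and text encoding of the tag files
        printf 'BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n' > bagit.txt
        # The manifest: one "checksum  path" line per payload file
        find data -type f -exec md5sum {} + | sort -k 2 > manifest-md5.txt
    )
}

# Usage (the path is a placeholder):
#   make_simple_bag /path/to/directory
```

For anything you actually care about, use a real BagIt tool; the point here is only that a bag is ordinary files and folders, with no special container format involved.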

Update:  Please also read this terrific tutorial by Kathryn Gronsbell for the 2016 NDSR symposium for a more detailed rundown of BagIt-python use and installation, including sample exercises!!!

4. BaggerJS


It’s still in an early/experimental phase, but the LoC is also working on a web-based application for bagging called BaggerJS (built on a version of the BagIt library written in JavaScript, which yes, for those wondering at home, is a totally different programming language than Java that works specifically for web applications, because we needed more versioning talk in this post).

Right now you can select and hash files (generate checksums for file fixity), and upload valid bags to a cloud server compatible with Amazon’s S3 protocol. Metadata entry and other features are still incomplete, but if development continues, this might, like Exactly, be a nice, simplified way to perform light bag transfers, particularly if you use a cloud storage service to back up files. It also has the advantage of not requiring any installation whatsoever, so novice users can more or less step around the Java vs. Python, GUI vs. CLI questions.

https://libraryofcongress.github.io/bagger-js/

5. Integrated into other software platforms/programs

The BagIt library has also been integrated into larger, more complex software packages, designed for broader digital repository management. Creating bags is only one piece of what these platforms are designed to do. One good example is Archivematica, which can perform all kinds of file conformance checks, transcoding, automatic metadata generation and more. But it does package data according to the BagIt spec whenever it actually transfers files from one location to another.

And that’s the other, more complicated way to use the BagIt library – by building it into your own software and scripts! Obviously this is a more advanced step for archivists who are interested in coding and software development. But the various BagIt versions (Java, Python, JavaScript) and the spec itself are all in the public domain and anyone could incorporate them into their own applications, or recreate the BagIt library in the programming language of their choice (there is, for instance, a BagIt-ruby version floating around out there, though it’s apparently deprecated and I’ve never heard of anyone who used it).