Reading MARC data

Reading MARC data – Reading MARC data with File_MARC

Overview

File_MARC allows you to read Machine Readable Cataloging (MARC) data in MARC 21 format. File_MARCXML, which is bundled with File_MARC, allows you to read MARCXML formatted data.

Reading MARC data from different sources

Your input source can be a file or PHP stream (File_MARC::SOURCE_FILE) or a string (File_MARC::SOURCE_STRING). The source location and the source type are the first and second arguments to the File_MARC and File_MARCXML constructors.

Reading MARC 21 data from a file

In the following example, MARC 21 data has been stored on disk in a file named journals.mrc. To read the MARC 21 data, we call the constructor for File_MARC. We do not need to tell File_MARC that journals.mrc is a filename or stream pointing to the MARC content, because the second parameter of the constructor defaults to File_MARC::SOURCE_FILE.

<?php
require 'File/MARC.php';

// Retrieve a set of MARC records from a file
$journals = new File_MARC('journals.mrc');

// Iterate through the retrieved records
while ($record = $journals->next()) {
    // Pretty print each record
    print $record;
    print "\n";
}
?>

Reading MARCXML data from a string

In the following example, MARCXML data has been returned from a call to a Web service and has therefore been stored in a PHP variable, $xml_data, as a string. To read the MARCXML data, we call the constructor for File_MARCXML. To tell File_MARCXML that $xml_data is a string, we specify File_MARC::SOURCE_STRING as the second parameter of the constructor.

<?php
require 'File/MARCXML.php';

// Retrieve a set of MARCXML records from a string
$journals = new File_MARCXML($xml_data, File_MARC::SOURCE_STRING);

// Iterate through the retrieved records
while ($record = $journals->next()) {
    // Pretty print each record
    print $record;
    print "\n";
}
?>

Working with MARC records

A File_MARC object provides an iterable set of File_MARC_Record objects representing MARC records. Each record, in turn, consists of a leader and an iterable set of File_MARC_Data_Field or File_MARC_Control_Field objects representing MARC fields. A File_MARC_Data_Field consists of an iterable set of File_MARC_Subfield objects representing MARC subfields.

Printing the elements of a record

<?php
require 'File/MARC.php';

// Retrieve a set of MARC records
$bibrecords = new File_MARC('catdump.mrc', File_MARC::SOURCE_FILE);

// Iterate through the retrieved records
while ($record = $bibrecords->next()) {
    // Print the leader
    print $record->getLeader();
    $subjects = $record->getFields('650');
    if ($subjects) {
        // Retrieve just the first field whose tag matches the pattern 24.
        print $record->getField('24.', true);
        print "\n";

        // Now print all of the retrieved subjects
        foreach ($subjects as $field) {
            print $field;
            print "\n";
        }
        print "\n";
    }
}
?>

All of this means that File_MARC makes it easy to read in a set of MARC records and iterate through the contents to retrieve specific fields and subfields. File_MARC offers convenience methods for retrieving specific fields without forcing you to iterate through the fields. getField returns the first field that matches the field name, while getFields returns an array of all of the fields that match the specified field name. Both of these methods accept an optional boolean parameter that specifies whether your match string should be treated as a Perl Compatible Regular Expression.

Retrieving all 650 fields from a record

<?php
require 'File/MARC.php';

// Retrieve a set of MARC records from a z39 result string
$bibrecords = new File_MARC($z39_result, File_MARC::SOURCE_STRING);

// Iterate through the retrieved records
while ($record = $bibrecords->next()) {
    // Retrieve an array of all of the 650 fields
    $subjects = $record->getFields('650');
    if ($subjects) {
        // Retrieve just the first field whose tag matches the pattern 24.
        print $record->getField('24.', true);
        print "\n";

        // Now print all of the retrieved subjects
        foreach ($subjects as $field) {
            print $field;
            print "\n";
        }
        print "\n";
    }
}
?>
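
To make the regular expression option more concrete, the following sketch shows which tags a pattern such as '24.' would select. It does not call File_MARC at all: the helper matchingTags() is hypothetical, and it assumes that the pattern is applied to each tag as an unanchored PCRE match, which may differ in detail from what the library does internally.

```php
<?php
// Hypothetical helper, not part of File_MARC: returns the tags that match
// a pattern, assuming an unanchored PCRE match against each tag.
function matchingTags(array $tags, string $pattern): array
{
    $matches = [];
    foreach ($tags as $tag) {
        if (preg_match('/' . $pattern . '/', $tag) === 1) {
            $matches[] = $tag;
        }
    }
    return $matches;
}

$tags = ['100', '240', '245', '246', '650'];
print implode(', ', matchingTags($tags, '24.')) . "\n"; // prints "240, 245, 246"
```

Because the dot matches any single character, '24.' covers every tag from 240 to 249, which is why getField('24.', true) can return a uniform title (240) before the title statement (245) if both are present.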

Iterating through fields and subfields

When you iterate over the fields of a record (for example, the set returned by getFields()) using foreach(), the MARC tag for each field is returned as the key of the element and the field, which holds the set of subfields, is returned as the value of the element.

Similarly, when you iterate over the subfields of a File_MARC_Data_Field object (the set returned by getSubfields()) using foreach(), the code for each subfield is returned as the key of the element and the subfield is returned as the value of the element.

Iterating over fields and subfields in a MARC record

In the following example, we iterate through a set of 650 fields to print out the subject headings contained in the subfields for each field.

<?php
require 'File/MARC.php';

// Retrieve a set of MARC records from a z39 result string
$bibrecords = new File_MARC($z39_result, File_MARC::SOURCE_STRING);

// Go through each record
while ($record = $bibrecords->next()) {
    // Iterate through the fields
    foreach ($record->getFields() as $tag => $subfields) {
        // Skip everything except for 650 fields
        if ($tag == '650') {
            print "Subject:";
            foreach ($subfields->getSubfields() as $code => $value) {
                print " $value";
            }
            print "\n";
        }
    }
}
?>

Retrieving field indicators

Data fields, represented by the File_MARC_Data_Field class, offer a getIndicator() function to enable you to retrieve the value of an indicator.

Retrieving an indicator

In the following example, we retrieve the title (245) field of a MARC record and check the second indicator for the field to determine whether there are non-filing characters that we should ignore when sorting the contents of the field.

<?php
require 'File/MARC.php';

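// $record is assumed to be a File_MARC_Record returned by File_MARC::next()
$titleField = $record->getField('245');
// Cast to int so that a blank indicator counts as zero non-filing characters
$nonfiling = (int) $titleField->getIndicator(2);
$titleText = $titleField->getSubfield('a')->getData();

if ($nonfiling) {
    // Sort using the $a subfield with the non-filing characters skipped
    $title = substr($titleText, $nonfiling);
} else {
    // Sort using the entire contents of the $a subfield
    $title = $titleText;
}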
?>
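
The non-filing logic can also be tried without any MARC data at hand. The following is a minimal sketch that works on plain strings only; filingTitle() is a hypothetical helper rather than part of File_MARC, and it assumes that the second indicator arrives as a one-character string that is either a digit or a blank.

```php
<?php
// Hypothetical helper, not part of File_MARC: drops the leading non-filing
// characters from a title, given the raw value of the second indicator.
function filingTitle(string $title, string $indicator): string
{
    // A blank or non-numeric indicator is treated as "skip nothing"
    $skip = ctype_digit($indicator) ? (int) $indicator : 0;
    return substr($title, $skip);
}

print filingTitle('The hobbit', '4') . "\n"; // prints "hobbit"
print filingTitle('Dune', '0') . "\n";       // prints "Dune"
print filingTitle('Dune', ' ') . "\n";       // prints "Dune"
```

Note that substr() counts bytes. For titles stored as UTF-8, a character-aware function such as mb_substr() may be the better choice if the leading article contains multibyte characters.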