Using Library Data Files

IMLS provides data files for the Library Statistics Program. These data files are available for downloading from the IMLS website.

These pages give you information about finding and using the data files.

Why download data files?

Advantages of Using Data Files

Researchers use data files to perform customized data analysis that is not available in the web tools and publications. For example, the publications and web tools may not offer an analysis of the particular variables a researcher needs.

The "Search For Public Libraries" tool does not allow users to export or download data. The "Compare Public Libraries" tool provides export or download capabilities.

In the web tools and publications, ratios (e.g., per capita values) are precalculated. These calculations may use a different formula than the researcher requires. Researchers who download and work with the data files directly control all calculations and aggregations themselves.

Data files contain some data fields not used in the web tools, and therefore not included in a file downloaded or exported from the web tools. These include:

Disadvantages of Using Data Files

Users must download the entire data file, then sort the records as desired and delete those not wanted. The web tools, by contrast, allow users to download only the records of interest.
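
The sort-and-delete step can also be scripted rather than done by hand. The following is a minimal sketch, assuming a comma-delimited text file with a header row. The column names (STATE, LIBNAME, VISITS) and the sample values are hypothetical; the actual layout should be taken from the documentation for the file in question.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded data file.
# Real files may be fixed-width or use different column names.
SAMPLE = """STATE,LIBNAME,VISITS
OH,Alpha Public Library,52000
TX,Beta Public Library,18000
OH,Gamma Public Library,97000
"""

def select_records(text, state):
    """Keep only the records for one state, sorted by visits (descending)."""
    rows = csv.DictReader(io.StringIO(text))
    kept = [row for row in rows if row["STATE"] == state]
    kept.sort(key=lambda row: int(row["VISITS"]), reverse=True)
    return kept

ohio = select_records(SAMPLE, "OH")
for row in ohio:
    print(row["LIBNAME"], row["VISITS"])
```

For a real file, the same function could be given the contents of the downloaded text file instead of the in-memory sample.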

Fields containing calculated values for the web tools are not included in the data files. However, information on how these fields are calculated is provided below, so users can perform their own calculations.

Documentation files

What's in them?

The record layouts (usually in one or more Appendices) give you:

File formats

Since different applications require files in different formats, data files are often available in several formats.

Files are typically available in MS Access, ASCII, or SAS format (not all data files are available in SAS format). A number of older files are available only on diskette or magnetic tape.

The IMLS Electronic Catalog data page for each data file lists the format(s) in which the file is available and explains how to obtain it.

Many data files are compressed with WinZip for faster downloads.

Most documentation files are in Adobe PDF format.

Using MS Access format files

Data files in MS Access format (with the .MDB extension) can be used directly in the Microsoft Access database application and in any application that can import or read MS Access database files.

Using ASCII format files

ASCII-format files (with the .TXT extension) can be viewed and edited using any text editor (MS WordPad, TextPad, etc.), and imported into many software applications:

Some tips for using ASCII-format files:

Public-Use Files

The data files available on the IMLS Web site are public-use data files. These files have had some data removed to protect the confidentiality of individually identifiable survey respondents.

Public-use data files are publicly available without restriction, and do not require a license. Survey data are coded or aggregated without individually identifiable information. Data that could be directly identified with one individual (salaries and wages for librarians for a library with one librarian, for example) are removed.

The library web tools use the public-use data files, so the data shown by the tools are subject to the removals described above.

A small proportion of the collected data is held in restricted-use data files. These files contain individually identifiable information that is confidential and protected by law. For more information on restricted-use data files, please contact IMLS staff.

Differences between public- and restricted-use files in the Public Libraries Survey

From the PLS documentation titled "Data File, Public-Use: Public Libraries Survey: Fiscal Year 2001" (PDF file, 1,036 KB):

Public-use data. On the public-use Public Library Data File, selected expenditures data (i.e., Salaries, Benefits, Total Staff Expenditures, and Other Operating Expenditures) for public libraries have been removed (i.e., the field is blank) when total full-time equivalent (FTE) staff is less than or equal to 2.00, to protect the confidentiality of respondents. These data may also be suppressed for other libraries to ensure that all states that have suppressed data have a minimum of 3 suppressed records. The library's Total Operating Expenditures are not affected by the suppression of these data. No data are suppressed on the public-use State Summary/State Characteristics Data File or the Public Library Outlet Data File.
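
Because suppressed values appear as blank fields rather than zeros, analysis code should treat a blank as missing and not as an amount of 0. The following is a minimal sketch, assuming the fields have been read as text; the function name and the sample values are hypothetical.

```python
def parse_amount(field):
    """Return a float, or None when the field is blank (suppressed or missing)."""
    text = field.strip()
    return float(text) if text else None

# Hypothetical salary fields as they might appear in a text file.
raw_salaries = ["125000", "", "  ", "98000"]
salaries = [parse_amount(f) for f in raw_salaries]

# Average only over the libraries whose value is present.
reported = [s for s in salaries if s is not None]
average = sum(reported) / len(reported) if reported else None
print(average)
```

Counting blanks as zeros would understate averages and totals for the suppressed items, which is why the sketch excludes them from the denominator.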

Restricted-use data. No data are suppressed on the restricted-use Public Library Data File. The inclusion of all expenditures data irrespective of the number of employees enables the identification of individual salary data at some libraries.

Data Notes

Calculated fields in the web tools

Several fields used by the web tools are calculated from other fields in the data files. These calculated fields are not included in the data files downloaded directly from the IMLS website. Calculated fields include per-capita and per-1,000-enrolled values, percent-of-total values, etc.
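
Calculated fields of this kind can be rebuilt from the raw fields after download. The following is a minimal sketch using generic definitions of per-capita, per-1,000, and percent-of-total values. The function names are hypothetical, and the formulas are common conventions, not necessarily the exact ones used by the web tools.

```python
def per_capita(value, population):
    """Value divided by population; None if either is unavailable or zero."""
    if value is None or not population:
        return None
    return value / population

def per_thousand(value, base):
    """Value per 1,000 units of the base (e.g., per 1,000 enrolled)."""
    if value is None or not base:
        return None
    return 1000.0 * value / base

def percent_of_total(part, total):
    """Part expressed as a percentage of the total."""
    if part is None or not total:
        return None
    return 100.0 * part / total

print(per_capita(450000, 30000))
print(per_thousand(30, 12000))
print(percent_of_total(25, 200))
```

Returning None for a missing or zero denominator keeps undefined ratios from being confused with a true value of zero.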

Population fields - Differences between "Population of Legal Service Area" and "Unduplicated Population" for Public Libraries

From the PLS documentation titled "Data File, Public-Use: Public Libraries Survey: Fiscal Year 2001" (PDF file, 1,036 KB):

Survey Population Items

The PLS has three population items: (1) Population of Legal Service Area (reported for each public library by the state library agency), (2) Total Unduplicated Population of Legal Service Areas (a single figure, reported by the state library agency), and (3) Official State Total Population Estimate (reported by the state library agency). The total Population of Legal Service Area for all public libraries in a state may exceed the state's Total Unduplicated Population of Legal Service Areas or the Official State Total Population Estimate. This occurs when the state has one or more geographically adjacent libraries (for example, a county library and a city library within the county) that serve, and therefore count, the same population. Twenty-six states had such overlapping service areas in FY 2001.

In order to do meaningful analysis using Population of Legal Service Area data (for example, the number of books/serial volumes per capita), the data were adjusted to eliminate duplicative reporting in states with overlapping service areas. The Public Library Data File has a derived unduplicated population of legal service area for each library for this purpose, called POPU_UND. This value was prorated for each library by calculating the ratio of a library's Population of Legal Service Area to the total Population of Legal Service Area for all libraries in the state, and applying the ratio to the state's Total Unduplicated Population of Legal Service Areas. (The latter item is a single, state-reported figure. It is on the State Summary/State Characteristics Data File and is also called POPU_UND.)
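
The proration described above can be sketched in a few lines. This is an illustration of the stated ratio method only; the function name, library names, and population figures are hypothetical, and only POPU_UND is a field name taken from the documentation.

```python
def prorate_unduplicated(lsa_populations, state_unduplicated):
    """Derive a POPU_UND-style value for each library in one state.

    lsa_populations: mapping of library name -> Population of Legal Service Area
    state_unduplicated: the state's Total Unduplicated Population of Legal Service Areas
    """
    total_lsa = sum(lsa_populations.values())
    if total_lsa == 0:
        raise ValueError("total Population of Legal Service Area is zero")
    # Each library's share of the state's summed LSA population,
    # applied to the state's single unduplicated figure.
    return {
        name: (pop / total_lsa) * state_unduplicated
        for name, pop in lsa_populations.items()
    }

# A county library and a city library inside the county count some of the
# same residents, so the LSA populations sum to more than the state figure.
state = {"County Library": 100000, "City Library": 50000}
popu_und = prorate_unduplicated(state, 120000)
for name, value in popu_und.items():
    print(name, round(value))
```

By construction the prorated values sum to the state's unduplicated total, so per-capita figures built on them do not double count the overlapping population.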

Imputation

Imputation is a statistical method for supplying a valid value where data are missing. Note that imputation has been applied to both the public- and restricted-use data files, but not to the data used by the Library Statistics Program web tools.

Imputation in the Public Libraries Survey

From the PLS documentation titled "Data File, Public-Use: Public Libraries Survey: Fiscal Year 2001" (PDF file, 1,036 KB):

All libraries, including nonresponding libraries, were sorted into imputation cells based on the region and size of population served. Item imputation was performed on each record with nonresponse variables. The data are identified as either imputed (estimated) or reported (actual) on the survey data file, through the use of imputation codes.
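
To make the idea of imputation cells concrete, the following sketch groups records by region and population-size band, fills each missing item with the mean of the reported values in the same cell, and flags every value as reported or imputed. This is only an illustration: the excerpt above does not state which estimation method the survey uses within a cell, and the size bands, field names, and flag values here are all hypothetical.

```python
from collections import defaultdict

def size_band(population):
    """Hypothetical size bands; the survey's actual cut points are not given here."""
    if population < 10000:
        return "small"
    if population < 100000:
        return "medium"
    return "large"

def impute_by_cell(records, item):
    """Fill missing values of `item` with the cell mean (modifies records in place)."""
    cells = defaultdict(list)
    for rec in records:
        cells[(rec["region"], size_band(rec["population"]))].append(rec)
    flag = item + "_flag"
    for members in cells.values():
        reported = [m[item] for m in members if m[item] is not None]
        for m in members:
            if m[item] is not None:
                m[flag] = "reported"
            elif reported:
                m[item] = sum(reported) / len(reported)
                m[flag] = "imputed"
            else:
                m[flag] = "missing"  # no reported value in the cell to draw on
    return records

records = [
    {"region": "Midwest", "population": 5000, "visits": 12000},
    {"region": "Midwest", "population": 8000, "visits": None},
    {"region": "Midwest", "population": 7000, "visits": 16000},
    {"region": "West", "population": 250000, "visits": 900000},
]
impute_by_cell(records, "visits")
for rec in records:
    print(rec["region"], rec["visits"], rec["visits_flag"])
```

Keeping a flag beside each value mirrors the role of the imputation codes on the survey file: an analyst can include or exclude estimated values as needed.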

Imputation in State Library Agencies Survey