Introduction to Library and Information Science/Information Organization

After reading this chapter, a student should be able to articulate:
 * 1) how to build an effective bibliography
 * 2) how libraries share catalog records
 * 3) the purpose and structure of MARC records
 * 4) the FRBR conceptual model
 * 5) the concepts that inform the field of Information Architecture
 * 6) the strengths of several major classification strategies
 * 7) the concepts of the semantic web and linked data
 * 8) major critiques of information organization practice

Why organize information?
The sheer abundance of information available on the Internet leads to limited user attention and a high reliance on gate-keeping services, such as search engines. These gate-keeping services capitalize on user attention scarcity by channeling users' attention toward certain documents and away from others.

Bibliographic metadata
Cataloging is the process of adding an item to a catalog, a process typically including bibliographic description, subject analysis, and classification.

MARC formats
The MARC formats are digital formats for the description of bibliographic items developed by the US Library of Congress during the 1960s to facilitate the creation and dissemination of cataloging between libraries. While the formats were originally created to facilitate printing of paper catalog cards, they are still in use today as the basis for most computerized library catalogs.

Jackie Radebaugh argues that the MARC format will survive in an era of global digital communication. She describes the modifications made to MARC to accommodate different types of materials, to make web addresses accessible from MARC records, and to provide access to an online table of contents from the MARC record. MARC 21 is used in many countries today, and MARC documentation has been translated into several languages. Current modifications include mapping MARC to a variety of languages together with the Dublin Core, SGML, and XML to adapt MARC to web-based environments. After it has been modified for web use, will it still be MARC? Radebaugh quotes one presenter at the 2000 ALA conference who stated that MARC is very much alive. At the same conference, Fred Kilgour, the man who championed MARC and is responsible for much of its success, speculated that MARC will be replaced within the next 30 years. In my opinion, MARC has been a resilient format, but it may be superseded by formats more adaptive to digital environments.
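
For readers who have not seen a MARC record, the following is a minimal sketch of its general shape: a list of fields, each with a three-character tag, two indicator positions, and coded subfields. The class and helper names (Field, first_subfield) and all sample values are hypothetical and are not taken from any MARC software. The tags follow MARC 21 conventions as commonly documented (100 for a personal-name main entry, 245 for the title statement, 856 for electronic location), and the indicator values are shown only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """One variable data field of a MARC-style record (illustrative only)."""
    tag: str                                        # three-character tag, e.g. "245"
    indicators: str = "  "                          # two indicator positions
    subfields: list = field(default_factory=list)   # (code, value) pairs


# A hypothetical record; every value here is invented for the example.
record = [
    Field("100", "1 ", [("a", "Example, Author.")]),
    Field("245", "10", [("a", "Example title /"), ("c", "Author Example.")]),
    Field("856", "42", [("u", "http://example.org/toc.html")]),
]


def first_subfield(record, tag, code):
    """Return the first value of the given subfield, or None if it is absent."""
    for f in record:
        if f.tag == tag:
            for c, value in f.subfields:
                if c == code:
                    return value
    return None


print(first_subfield(record, "245", "a"))  # the title
print(first_subfield(record, "856", "u"))  # the web address for the table of contents
```

The point of the sketch is that the tag and subfield codes, not their position in the record, tell a program what each piece of data means. This is what lets libraries exchange records between different systems.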

Bibliographic framework (BIBFRAME)
Several expert groups, including the Working Group on the Future of Bibliographic Control and the U.S. RDA Test Coordinating Committee, have recommended that the library community implement a new carrier for bibliographic data that replaces the MARC standards. The RDA Test Committee's Final Report suggested that new ways of cataloging are unlikely to yield significant benefits unless they are implemented on top of a new means for capturing and sharing bibliographic data. In other words, many experts believe that the MARC format is holding back the development of better cataloging practices.

The general plan for the new Bibliographic Framework enumerates a list of requirements for the new carrier, which is described as an “environment” rather than a simple “format”. Some of the most noteworthy requirements are that the new environment will support bibliographic description expressed as both textual data and linked data URIs; accommodate RDA, AACR2, DACS, VRA Core, and CCO descriptive rules; and provide for data that support or accompany bibliographic data, such as authority data, holdings data, preservation data, and technical data. The plan also notes that catalogers are likely to interact with the new data carrier on a more abstract level than they currently interact with MARC. BIBFRAME's approach is based on the RDF data model from the Linked Data community, for which a number of query and storage tools have already been developed.
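
To illustrate what "both textual data and linked data URIs" can mean in practice, here is a minimal sketch of a description stored as RDF-style statements (subject, property, value). It is not BIBFRAME itself: every URI lives under example.org, and the property names are hypothetical stand-ins for a real vocabulary.

```python
# Each statement is a (subject, property, value) triple, in the spirit of RDF.
# All URIs and property names below are hypothetical.
EX = "http://example.org/"

triples = [
    (EX + "work/1", EX + "vocab/title", "An example title"),      # textual data
    (EX + "work/1", EX + "vocab/creator", EX + "person/42"),      # linked data URI
    (EX + "person/42", EX + "vocab/name", "Example, Author"),     # textual data
]


def is_uri(value):
    """Crude check used only in this sketch: treat http(s) strings as URIs."""
    return value.startswith(("http://", "https://"))


def describe(subject):
    """Split a subject's statements into literal values and links to follow."""
    literals, links = {}, {}
    for s, p, o in triples:
        if s == subject:
            (links if is_uri(o) else literals)[p] = o
    return literals, links


literals, links = describe(EX + "work/1")
print(literals)  # the title, stored as plain text
print(links)     # the creator, stored as a URI that points to another description
```

A value stored as a URI can be followed to a separate description (here, the person), which is one way authority data can accompany bibliographic data without being copied into every record.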

Functional Requirements for Bibliographic Records


Historically, cataloging practice has been devoted to describing "books". Early attempts to standardize cataloging practice internationally, such as the 1961 Paris Principles, only addressed the cataloging of printed books. The Paris Principles describe themselves as only being applicable to "catalogues of printed books in which entries under authors' names and, where these are inappropriate or insufficient, under the titles of works are combined in one alphabetical sequence." Even in 1961, libraries offered their patrons materials other than print books, such as periodicals and musical recordings, and the number of formats that library catalogers have been called on to describe has only grown since that time. Contemporary library catalogs may include reference databases, e-books, DVDs, computer software, websites, blogs, audiobooks, digitized archival materials, and countless other resources.

As the number of formats described in library catalogs has grown, the meaning of the term book has become less clear. As Barbara Tillett notes:

When we say the word book in everyday language, we may actually mean several things. For example, when we say book to describe a physical object that has paper pages and a binding and can sometimes be used to prop open a door or hold up a table leg, FRBR calls this an "item." When we say book we also may mean a "publication" as when we go to a bookstore to purchase a book. We may know its ISBN but the particular copy does not matter as long as it's in good condition and not missing pages. FRBR calls this a "manifestation." When we say book as in "who translated that book," we may have a particular text in mind and a specific language. FRBR calls this an "expression." When we say book as in "who wrote that book," we could mean a higher level of abstraction, the conceptual content that underlies all of the linguistic versions, the story being told in the book, the ideas in a person's head for the book. FRBR calls this a "work."

Because of these issues, the cataloging community felt that it was necessary to have a new conceptual model for cataloging that did not center on the ambiguous term "book". The Functional Requirements for Bibliographic Records, or FRBR, was an attempt to clarify this hazy terminology, and to provide a model that was independent of particular cataloging codes and material formats.

FRBR starts with four "Group One Entities": the Work, the Expression, the Manifestation, and the Item. A traditional catalog record combines description at each of these levels, but generally centers on a description at the manifestation level. After defining these Group One Entities, FRBR goes on to define the relationships among them, as well as their relationships with other entities: people and corporate bodies such as authors and publishers (Group Two entities), and topics (Group Three entities).
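
The nesting of the four Group One Entities can be sketched as a small data structure. This is only one possible reading of the model, not an official FRBR implementation: the attribute names (title, language, publisher, year, barcode) and the sample data are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    barcode: str                      # one physical copy on a shelf


@dataclass
class Manifestation:
    publisher: str                    # one publication of an expression
    year: int
    items: list = field(default_factory=list)


@dataclass
class Expression:
    language: str                     # one particular text, e.g. a translation
    manifestations: list = field(default_factory=list)


@dataclass
class Work:
    title: str                        # the abstract intellectual creation
    expressions: list = field(default_factory=list)


# Hypothetical data: one work, two expressions, two manifestations, three items.
work = Work(
    title="An example novel",
    expressions=[
        Expression("English", [
            Manifestation("Example Press", 1999, [Item("copy-1"), Item("copy-2")]),
        ]),
        Expression("Spanish", [
            Manifestation("Editorial Ejemplo", 2004, [Item("copy-3")]),
        ]),
    ],
)


def count_items(work):
    """Count every physical copy of every publication of every text of a work."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


print(count_items(work))  # 3
```

A catalog built this way could answer "do you have this book?" at the work level, gathering all translations and editions, instead of making the user check each manifestation-level record separately.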

Like the earlier Paris Principles, FRBR is separate from specific cataloging standards such as AACR2 or International Standard Bibliographic Description (ISBD). However, the FRBR model has been used to inform new cataloging standards, such as Resource Description and Access (RDA), as well as changes in automated systems and user interfaces.

Classification
To make a classification system, you need four things:
 * classes, or ways to group objects based on similar characteristics
 * labels for the classes, or ways of expressing these conceptual classes in human language
 * notation, another way to express these classes. Notation helps with automated processes, as well as making things easier for humans.  For example, the Dewey Decimal Classification system notates the subject of "Jewish people in Ukraine" as 305.89240477.  This is much easier to shelve and retrieve than the non-notated version.
 * relationships between classes (these are usually taxonomic, i.e. hierarchical). In the example above, for instance, Europe stands in a hierarchical relationship with Russia and Eastern Europe.
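
The four ingredients above can be sketched in a few lines. The toy scheme below uses the final digits of the example notation (477) and assumes that 4, 47, and 477 stand for Europe, Russia and Eastern Europe, and Ukraine. Treat these labels as illustrative and not as an authoritative extract from the DDC schedules. The function name broader_classes is hypothetical.

```python
# Classes are the dictionary entries, notation is the key, and the label is the value.
# The hierarchy is carried by the notation itself: each extra digit narrows the class.
AREAS = {
    "4": "Europe",
    "47": "Russia and Eastern Europe",
    "477": "Ukraine",
}


def broader_classes(notation, scheme):
    """Return labels from broadest to narrowest by taking successive prefixes."""
    prefixes = (notation[:i] for i in range(1, len(notation) + 1))
    return [scheme[p] for p in prefixes if p in scheme]


print(broader_classes("477", AREAS))
# ['Europe', 'Russia and Eastern Europe', 'Ukraine']

# Notation also gives a shelf order for free: sorting the call numbers as text
# files the narrower topic directly after the broader one.
shelf = sorted(["306", "305.89240477", "305.8"])
print(shelf)
# ['305.8', '305.89240477', '306']
```

Because the relationships are encoded in the notation, neither a person shelving books nor a program needs to consult the labels to put related subjects next to each other.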

Dewey Decimal Classification
The DDC is in its 23rd edition and is the "world's most widely used classification system." It can be ordered as a four-volume print version, as the full WebDewey, or as an abridged print or web version that is better suited to smaller collections. Membership includes quarterly updates to the web versions, a semiannual DDC newsletter, offers for conferences and workshops, and OCLC articles and case studies. The website is simple to use: it is not too complicated and gets to the point if your library has a need for it. It doesn’t have a lot of “bells and whistles”, but it has what is necessary and is simple to follow. As it pertains to this chapter, the site is a basic source to use when learning about DDC. The Dewey Decimal system is such a large part of so many libraries that it is hard to imagine not having such a well-structured organizational tool.

Information architecture
Information Architecture is a field that started in the 1990s, with the high-tech boom in full force. An Information Architect (IA) is similar to a building architect, except that an IA designs websites. An IA works out the logical structure of a website, looking at the needs of the users and shaping the visual and interaction design around the user experience, so that it is easier to find information and to move around the site. This also makes the site easier to manage.

The key concepts that an IA looks at are:
 * organization
 * navigation
 * search
 * labeling
 * controlling

The IA then draws up blueprints and works closely with the technical, graphic, and editorial team members to finish the site. IA Chris Farnum is very knowledgeable and informative about his profession and goes on to explain that there are several ways to find out more about his field: through books, seminars, and college courses. I like the way that the information has been presented very clearly and explained in detail, and I see a strong need for this type of profession in today’s world of technology. With all of the sites available on the World Wide Web today, I can see a great need to make your site as marketable and user-friendly as possible.

Information retrieval
Artificial intelligence has two main applications in information retrieval: organization of application methods, and the design of classification methods. There was no shared terminology between the fields, making it difficult for the two areas to collaborate initially. Linda C. Smith, in her 1976 article "Artificial Intelligence and Information Retrieval," predicts that as artificial intelligence and information retrieval continue to expand, users will still need greater cognitive ability to discern what has been retrieved from the original search. Another concern was that users would need to be experts in order to get the desired results from the system. At the time of the article, there was a growing interest in the ability of these retrieval systems to answer questions and retrieve facts, both of which have come to fruition in the modern search engines used every day. Artificial intelligence was seen to have both short-term and long-term effects on information retrieval. In the short term, it would modify the results of a current search during a query to meet the user’s current needs. In the long term, it would modify the document representations to improve responses. This article was a follow-up to the author’s initial research in 1980. Little had changed in that time as far as attitudes and outlook for the feasibility of using artificial intelligence techniques as practical applications in library science and information retrieval. With hindsight on my side, it is interesting to see that common search engines now offer the kind of aided search for related topics that the author thought only experts would be able to complete.

The World Wide Web (WWW) is common in school libraries because of its value as an educational tool. Previous research indicates that domain expertise improves online search performance. Other research shows that WWW browsing experience does not play a significant role in achieving a higher efficiency or accuracy level in a search. The authors' study tests fourth-graders' online search performance in relation to how proficient they were in WWW use; the level of domain expertise (Dutch literature) was consistent. The results showed that WWW experts were better than novices at locating Web sites but that WWW experience does not substantially affect how well information is located on a specific Web site. The authors argue that locating Web sites involves more use of search engines, a skill in which the experts are more proficient, but finding information on a Web site generally involves browsing, in which WWW experience is not as important. Although the novices were not true beginners and the experts not professionals, the novices would greatly benefit from courses teaching search skills, such as how to use search engines and Boolean operators. I think that instruction in determining the relevance and validity of search results is just as important as search skills, but the article does not address this. However, judging relevance depends on the user's knowledge of a subject that school librarians are not often responsible for teaching. A good relationship among teachers, school librarians, and the WWW must emerge for students to receive the best possible education.

Information science has at its core the concept of “Relevance”, which is a subjective notion. Relevance is defined broadly as a measure of the effectiveness of an exchange or contact of information between a source and a user, based on differing views of what constitutes effective communication. This concept is the basis for how entire information retrieval systems are designed and used. There are many differing views on what relevance means, but all of them are related and interconnected regardless of how they are defined or applied. The author describes in great detail the framework for these differing views and the underlying philosophies behind each concept of what constitutes effective communication. He argues that all of these differing constructs are correct yet incomplete, depending on where one starts one's examination of the communication process. He ends his paper with an appropriate call for more study on the subject. His paper is a recap of the opposing arguments of three decades ago, but it is more important now than ever before, as new information systems come into being (the Internet, the electronic database, funding for collections, etc.), all based on an incomplete definition of “effective communication”.

The Semantic Web, RDF, and linked data


Eric Miller and Ralph Swick describe the Semantic Web as "an extension of the current Web in which the meaning of information is clearly and explicitly linked from the information itself, better enabling computers and people to work in cooperation."

The Semantic Web is a stack of technologies that seeks to convert the current Web, which is full of unstructured and semi-structured documents, into a "web of data" where documents are all available in machine-readable formats as well as human-readable ones. Semantic Web enthusiasts argue that this abundance of machine-readable formats will allow both human users and automated technologies to find, share, and combine information more easily.

The Semantic Web fosters and encourages greater data reuse by making it available for purposes not planned or conceived by the data provider. Suppose you want, for example, to locate news articles published in the previous month about companies headquartered in cities with populations under 500,000 or to compare the stock price of a company with the weather at its home base or to search online product catalogs for an equivalent replacement part for something. The information may be there in the Web, but currently only in a form that requires intensive human processing.
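
The first of these questions can be sketched as a small query over machine-readable statements. The sketch assumes the facts have already been published as simple (subject, property, value) statements by three different providers. All names and numbers are invented, the property names are hypothetical, and the "previous month" condition is left out to keep the example short.

```python
# Hypothetical statements, imagined as coming from a business directory,
# a census table, and a news index respectively.
triples = [
    ("AcmeCo", "headquarteredIn", "Smallville"),
    ("BigCorp", "headquarteredIn", "Metropolis"),
    ("Smallville", "population", 120_000),
    ("Metropolis", "population", 3_400_000),
    ("article-1", "mentions", "AcmeCo"),
    ("article-2", "mentions", "BigCorp"),
]


def objects(subject, prop):
    """All values stated for a subject and property."""
    return [o for s, p, o in triples if s == subject and p == prop]


def subjects(prop, obj):
    """All subjects that state the given property with the given value."""
    return [s for s, p, o in triples if p == prop and o == obj]


def articles_about_small_city_companies(limit=500_000):
    """Combine the three sources: company -> city -> population -> articles."""
    found = []
    for company, prop, city in triples:
        if prop != "headquarteredIn":
            continue
        populations = objects(city, "population")
        if populations and populations[0] < limit:
            found.extend(subjects("mentions", company))
    return found


print(articles_about_small_city_companies())  # ['article-1']
```

None of the three imagined providers planned for this question. It becomes answerable only because their data share a common structure and common identifiers.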

A key resource for current work with RDF is DBpedia, an effort to extract RDF data from the Wikipedia project. An example of a typical DBpedia record can be seen at http://dbpedia.org/page/Audre_Lorde, a human-language description of Audre Lorde, a "black lesbian feminist mother poet warrior," who also worked as a librarian. This exact same information is available in a machine-readable format at the URI http://dbpedia.org/data/Audre_Lorde. For every URI for humans in the format http://dbpedia.org/page/Topic, there is a URI for machines in the format http://dbpedia.org/data/Topic, which expresses the exact same data. Notice also that all of the properties and many of their values are links that you can click on. Most of these links are also available in machine-readable formats, which means that a machine could follow these links repeatedly to integrate information about Audre Lorde from numerous sources.
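
The page-to-data pattern described above is regular enough to express as a short function. This sketch only rewrites the URI as a string and does not contact DBpedia. The function name data_uri is hypothetical.

```python
PAGE_PREFIX = "http://dbpedia.org/page/"
DATA_PREFIX = "http://dbpedia.org/data/"


def data_uri(page_uri):
    """Turn a human-readable DBpedia page URI into its machine-readable twin."""
    if not page_uri.startswith(PAGE_PREFIX):
        raise ValueError("not a DBpedia page URI: " + page_uri)
    topic = page_uri[len(PAGE_PREFIX):]
    return DATA_PREFIX + topic


print(data_uri("http://dbpedia.org/page/Audre_Lorde"))
# http://dbpedia.org/data/Audre_Lorde
```

A program that follows links from one record to the next could apply the same rewrite at each step to stay on the machine-readable side of the site.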

Miller and Hillmann, in order to make sense of the semantic (contextual) web, describe the makeup of the web: semantics, structure, and syntax. Semantics refers to the context of information and its meaning. The structure encompasses how the information is organized, and the syntax is how the semantics and structure are communicated. Extensible Markup Language (XML) deals with the syntax, and Resource Description Framework (RDF) is what enables the structure. Libraries are best equipped to embrace XML and RDF to address cataloging and web-based interfaces to share information. It is their responsibility to utilize traditional models (MARC) and current developments (XML and RDF) to control their information and provide usable interfaces for patrons. The article is succinct and straightforward, and the authors' attempts to decode these swirling acronyms should be commended. However, from a layperson's perspective, these are still difficult concepts to wrap one’s head around. Miller and Hillmann are concerned that libraries’ focus is too narrow to meet the needs of patrons. Now that the semantic web has greater possibilities, libraries should include items usually forgotten, such as older journals or sound files. This is very postmodern indeed. As exciting as it is to see the inclusion of items generally left out of a library’s domain, I wonder about its feasibility in a public library. How prepared are public librarians to learn these new languages? Patrons are ready to see their public library as a resource not just for books and Internet access, but are libraries ready to deliver?

Knowledge management
Knowledge Management is an outgrowth of data and information management. Increased competition and a mobile workforce with analytical skills and reduced loyalty to the organization make the concept of knowledge management and retention appealing to managers.

Knowledge Management (KM), once the sole domain of the corporate world, is now necessary to many information disciplines. Information professionals need an understanding of what makes KM effective. This understanding begins with knowledge as an ever-changing entity, made up of human experiences, emotions, and senses. Scholars have identified knowledge as including the tacit (internal) and the explicit (manifestations of the internal). Organizations have developed KM systems to facilitate both tacit and explicit knowledge, and the sharing of the two, through discussion, training, team-building, etc. The author contends that (despite critics’ assessment that KM is incapable of a process so subjective and human) examples abound from the corporate world where KM has maximized profitability, innovation, etc. Based on these examples, the author stresses principles to guide effective KM. These include an environment dedicated to valuing individual experience and open communication as a means to share and learn; using technology to accommodate knowledge (emphasizing currency, accessibility, etc.); and an understanding that KM requires a rooting in both the tacit and explicit forms of knowledge. The fluidity of knowledge is well developed. However, how this evolving knowledge affects KM is illustrated by examples solely from the realm of corporate organizations. If KM is increasing in scope beyond the traditional organizations, there is a surprising lack of even anecdotal evidence from new, information-based environments. The implications given for information professionals are a rehashing of the models that have served corporations. It is unclear how KM, armed with a sense of the dynamic and human nature of knowledge, will look different in an information-based environment. This article would be better served if it distinguished how effective KM would look, feel, and operate under the expanding information professions versus the traditional corporate model.

David Blair contends that early attempts to manage knowledge failed because they attempted to improve or replace human decision making. Knowledge management does not replace human decision making but facilitates it. Experts are part of the system. The author persuasively argues that data and information management offer limited returns to an organization and that artificial intelligence projects and systems without human experts have not proven successful. The author contends that knowledge management requires “communities of practice,” a culture in which experts share knowledge with novices. Additionally, a wide variety of informative media must be available through technology to support Knowledge Management. Knowledge workers require strong critical thinking skills and the ability to find and evaluate information from a variety of sources.

The greatest challenges to implementing knowledge management are creating an organizational culture that facilitates the sharing of knowledge, the treatment of tacit knowledge, and the legal issues concerning the nature of intellectual property. The author contends that though tacit knowledge may be inexpressible, rules of thumb, "best practices", and case studies of problems and methods of resolution can provide experts with the knowledge to make informed decisions.

Some aspects of Knowledge Management have already been implemented. Knowledge management is one of many emerging information technologies that organizations employ to remain competitive.

The politics of information organization
Sanford Berman's Prejudices and Antipathies

Hope Olson's The Power to Name