The Incredible, Embeddable Web

Originally published in Online 31(5): 43-47, Sept-Oct 2007.

A Web version of an On the Net column from Online
by Greg R. Notess

Sharing content is popular these days, thanks to XML, RSS, APIs, and other acronyms for data interchange. Beyond the programmers and other acronym experts creating sophisticated ways to exchange content automatically, embedding has become a popular way for bloggers, social network participants, and others to easily share content. You can embed videos, presentations, documents, images, spreadsheets, and lists on other pages by simply copying and pasting a bit of the code that is given by these embeddable content sites.

The rise of video-sharing and social networking sites has helped to drive the popularity of the sharing option and to refine and simplify the process of embedding content on other sites. Take a look at almost any video-sharing service and scan for an "embed" or "share" option. Typically, these give a standard link to the video as well as the exact code needed for embedding the video in a blog, on MySpace, or on any other Web page. YouTube has an embed link just to the right of a video with its code. blip.tv has a link under the videos for "share" that opens up a variety of embedding options. Note that while these embed options show some complex-looking code, all that a user needs to do is copy that code and then paste it within a blog post or on another Web page. blip.tv gives customized code for general blogs, MySpace, and WordPress.

Blip.tv video sharing and embedding options

While video has been a driving force in this simplified copy-and-paste approach to embedding, many other Web 2.0-style applications have adopted it. Documents, presentations, spreadsheets, and other types of textual, numeric, and multimedia content can now be embedded on other sites. Beyond just the opportunities for bloggers and Web site owners, this growing embedding movement impacts information professionals. With content potentially coming from a diverse group of sources, can it always be determined where the information originated? How does this impact evaluation when the source may be unknown? Does the embedding approach allow the textual content to be indexed and searched? Anyone exploring Web documents, tracking intellectual property, or seeking to cite an original source faces new challenges with the embeddable Web.

EMBEDDING PRESENTATIONS

A variety of choices for sharing and embedding presentations now exist. Several sites offer an easy and free option for uploading a presentation that can then be embedded as well as commented on and rated. Of these sites, SlideShare is probably the best known, but authorSTREAM, SlideBurner, and Zoho Show offer similar services. For the presentation author, the process is as simple as uploading a video to a video-sharing site. It goes like this:

  1. Sign up for an account
  2. Log in
  3. Upload a presentation file
  4. Wait a few minutes for processing
  5. Then share, link to, or embed the file

Once the file is live, the presentation-sharing sites give code for embedding the presentation. Compared to downloading a large .ppt file and then viewing it on your own computer (assuming that you have PowerPoint installed on that computer), it is much easier to view these shared presentations online. Served up in a Flash presentation, an embedded presentation has buttons below the slides for moving forward or backward. For usability, instead of loading a new page each time, the slides appear within the Flash application after clicking the button.

As with YouTube, the most popular presentations shared on these sites are frequently on popular entertainment topics, jokes, or slide shows of photos. It took several years before PowerPoint became the lingua franca of conference presentations and a few more before conference Web sites started linking to the .ppt files of sessions. While I do not expect a sudden shift, where the majority of conference presentations appear on a sharing site or embedded on the conference site itself, the possibility is enticing.

My presentation from Computers in Libraries 2007 as a Zoho Show

EMBEDDING DOCUMENTS

Textual content has been the bulwark of Web page information from the beginning. So what need is there to embed text on a Web page? The formatting within published documents, especially those originally intended for print, is not always easy to reproduce within HTML for a Web page. Just as PowerPoint dominates presentations, Adobe's PDF is the predominant choice for retaining a document's format on the Web. Scribd has the potential to change that.

Scribd bills itself as intending to "create the world's largest open library of documents." Upload a Word, PDF, text, PowerPoint, Excel, rich text format, OpenOffice.org, PostScript, or Microsoft Reader (.lit) file, and Scribd makes it available as an embeddable document. This document, if made public when it was uploaded, shows link and embedding code that can be used to easily embed the entire document on another site. The embedded ScribdPaper format uses Flash and allows viewing, scrolling, and printing--all from within the document's window. View a Scribd document full screen (link near the upper-right corner) to see more of the document.

Scribd also converts uploaded documents and makes them available in the original format along with PDF, Word, text, and an automatically rendered audio format. The person doing the uploading determines whether to share the file and can limit which formats will be made available. Keep a file private if you do not want it shared with others or if you want to embed it on an intranet or share it with only a few people. Scribd adds a password to the embed link code so that a private document can still be easily made available to your selected audience. (Just bear in mind that if the code with the password is somehow leaked, anyone would be able to use it to view the document.)

Scribd can also function as a quick file converter, for example, making PDFs from OpenOffice.org, Word documents from PowerPoint, or text files from PostScript. Note that Scribd handles presentation files, albeit in a different fashion than the shareable presentation sites mentioned above. The ScribdPaper format is based on Macromedia's FlashPaper. If for no other reason, take a look at some documents on Scribd to explore how Macromedia's FlashPaper player works.

EMBEDDING SPREADSHEETS

While Scribd can handle some spreadsheets, it is not the only way to embed spreadsheets, just as it is not the only way to embed presentations. Spreadsheets can be used not only to keep track of and calculate numbers but also to easily create sequential lists and to store textual data in a tabular form.

Several Web 2.0 spreadsheet sites also offer embedding. Both Zoho Sheet and Google Docs & Spreadsheets allow an entire spreadsheet or just a range of cells to be published. For a range in Zoho Sheet, select the cells and then click on the "Publish" menu and "Publish this range." In Google, click the "Publish" tab, then "More publishing options," "HTML to embed in a webpage," specify a sheet, and then give the range of cells. One difference between the two is that with Zoho, a range of cells from a nonpublic spreadsheet can be embedded, while Google requires that the spreadsheet be public to let any of it be embedded. EditGrid and Num Sum also offer embeddable spreadsheets.

Unlike the video- and presentation-sharing sites, the spreadsheet sites have few community comment aspects. Num Sum offers both ratings and comments for public sheets, but none of the others do. This probably reflects how the content from the spreadsheet-sharing sites tends to be much more information- or task-oriented rather than entertainment-focused.

RSS FEEDS

One central concept behind RSS feeds is that they can be shared and reused. While much RSS viewing occurs as personal feed reading, using Bloglines or Google Reader, an RSS feed can also be embedded on a Web page. There is not yet a community-sharing site for feeds that makes embedding RSS feeds as easy as YouTube, SlideShare, and Scribd make embedding their own content. Yet several tools exist that help embed a feed without advanced programming knowledge.

Feedroll's RSS Viewer is an example. Input the URL of an RSS feed or choose one from the drop-down list, select from a list of design options, update, and Feedroll provides the embed code. Grazr offers a similar function but expands the options to include a whole list of RSS feeds as an OPML file (outline processor markup language that can contain a collection of feeds).
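As a rough illustration of what such a tool does behind the scenes, the following Python sketch turns the items of an RSS 2.0 feed into an HTML list that could be pasted into a page, and pulls the feed addresses out of an OPML file. The sample feed, the function names, and the markup produced are assumptions made for illustration; this is not the code that Feedroll or Grazr actually uses.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample feed; a real tool would fetch this from the feed URL.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second &amp; last</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def rss_to_html_list(rss_text, max_items=5):
    """Render the first max_items entries of an RSS 2.0 feed as an HTML list."""
    root = ET.fromstring(rss_text)
    lines = ["<ul>"]
    for item in root.findall("./channel/item")[:max_items]:
        # Escape titles and links so feed text cannot break the host page.
        title = escape(item.findtext("title", default="(untitled)"))
        link = escape(item.findtext("link", default="#"), quote=True)
        lines.append(f'  <li><a href="{link}">{title}</a></li>')
    lines.append("</ul>")
    return "\n".join(lines)


def opml_feed_urls(opml_text):
    """Collect the feed URLs (xmlUrl attributes) listed in an OPML outline."""
    root = ET.fromstring(opml_text)
    return [o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl")]


print(rss_to_html_list(SAMPLE_RSS))
```

The design options that these tools offer (colors, number of items, and so on) would simply change how the list is rendered; the feed itself stays at its original address.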

MORE EMBEDDING

Many other content types can be easily embedded as well. Screencasts can be generated as Flash files with the necessary code embedded on an HTML page. Images have been embedded since the earliest days of the Web. Look at a random MySpace page to be reminded of the potential annoyance of embedded audio files. Simply want to manage a list of items and make the list embeddable? FLEXlists provides a spreadsheet-like structure for creating and then embedding lists.

Even databases can be embedded. Zoho Creator can create an online database from an Excel spreadsheet, from its application gallery, or from scratch. Once a database has been created, various views of the database content can be embedded on a site. Each Zoho Creator view has an "Embed this page in your site" link that offers both an IFrame and a JavaScript version of the embedding code.
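For readers curious about what an IFrame-style embed snippet amounts to, here is a minimal Python sketch that builds one. The function name, the default sizes, and the example address are hypothetical; each service generates its own code, which may look quite different.

```python
from html import escape


def iframe_embed_code(src_url, width=500, height=400):
    """Build an IFrame snippet that displays src_url inside another page."""
    return (
        f'<iframe src="{escape(src_url, quote=True)}" '
        f'width="{int(width)}" height="{int(height)}" frameborder="0"></iframe>'
    )


# Hypothetical view URL; an actual service supplies its own address.
print(iframe_embed_code("http://example.com/view/123?mode=embed&theme=plain"))
```

The pasted snippet holds only a pointer to the service; the content itself stays on the originating site.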

Sites for Embedding Content

You can see examples of embedded content at the SearchEngineShowdown Web site (searchengineshowdown.com/test/embed.shtml).

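Presentations: SlideShare, authorSTREAM, SlideBurner, Zoho Show

Documents: Scribd

Spreadsheets: Zoho Sheet, Google Docs & Spreadsheets, EditGrid, Num Sum

RSS Feeds: Feedroll RSS Viewer, Grazr

Lists: FLEXlists

Databases: Zoho Creator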

SOURCE TRACKING ISSUES

While it is exciting to watch the ways people are using these sharing and embedding sites to combine content in new ways, a number of concerns become quickly evident as well. The growing embedding opportunities complicate the task of tracking down the original source of a document, not only for exploring intellectual property misappropriations but also for basic accuracy in citations. In some ways, embedding should make it easier. YouTube, SlideBurner, Zoho Creator, Feedroll, and others flag embedded content with their own service logo. Some have links back to the original content on their sites. When the content links back, that should make it easy to identify the content creator.

Unfortunately, too often what is loaded up on such sites is not necessarily coming from the original content creator or copyright holder. It can be unclear where the content originates. Find a useful document that you'd like to cite on Scribd? It may be an extract from some other previously published source. This is not necessarily the fault of the system, since Scribd provides space to add a description of the document that could contain full attribution. Unfortunately, most documents have, at best, only a limited description, even when they appear to come from a previously published source.

Search for national geographic on Scribd. A number of people have uploaded photo collections from the National Geographic Society that even include the copyright notices on each photo. If any of these people contacted the society for approval to upload, no such notice appears within the document or the description. Scribd has a Flag Document link and specific information for how copyright owners can submit complaints under the Digital Millennium Copyright Act.

SEARCH ISSUES

The impact on searching varies. In general, at this point, it is safest to assume that the embedded content is not indexed by Google or other search engines. That is not to say the content on the sharing site might not be indexed. For example, Scribd documents may be indexed at Scribd, but the search engines are unlikely to find which other sites have the document embedded. Then again, I have run searches at Scribd that find documents that none of the search engines find.

For other types of documents, remember that most are embedded using a Flash-based viewer, and most Flash content is not indexed well. A page I created for July's column about custom search engines embeds some information from a Zoho Sheet. My page of state library custom search engines (www.searchengineshowdown.com/cse/search-state-libraries/) has been indexed by the search engines, but none of them indexed the content embedded on the page -- the list of the 50 states and the URLs of the state library sites.
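One way to picture why this happens is to look at what a very simple text extractor sees in a page that embeds outside content. The Python sketch below is an assumption-laden illustration (the class, the function, and the sample page are hypothetical, and real search engine crawlers are far more sophisticated): it collects the page's own text and merely notes the address of anything embedded.

```python
from html.parser import HTMLParser


class PageTextExtractor(HTMLParser):
    """Collect the text a simple crawler would see in a page's own HTML."""

    EMBED_TAGS = {"iframe", "object", "embed"}

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.embedded_sources = []

    def handle_starttag(self, tag, attrs):
        if tag in self.EMBED_TAGS:
            attr_map = dict(attrs)
            # The embedded content lives at this address, not in the page.
            self.embedded_sources.append(attr_map.get("src") or attr_map.get("data"))

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


def visible_text_and_embeds(html_text):
    """Return (text found in the page itself, addresses of embedded content)."""
    extractor = PageTextExtractor()
    extractor.feed(html_text)
    return extractor.text_parts, extractor.embedded_sources


# Hypothetical host page that embeds a published spreadsheet range.
HOST_PAGE = """<html><body>
<h1>Host page heading</h1>
<iframe src="http://example.com/sheet/published-range"></iframe>
</body></html>"""

text, embeds = visible_text_and_embeds(HOST_PAGE)
print(text)    # ['Host page heading']
print(embeds)  # ['http://example.com/sheet/published-range']
```

Only the heading is part of the host page's text. The rows of the embedded sheet never appear in it, so an indexer that does not follow and attribute the embedded address has nothing from the sheet to index for that page.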

Whenever content is contained within video, audio, or images, search engines generally are unable to index anything unless there is an accompanying text transcript or text description. Presentations may be displayed primarily as images or as a movie. SlideShare generates a text transcript that is shown on the bottom of the page, but it is not always accurate. For one presentation, built in PowerPoint, that I uploaded to SlideShare, the generated text changed from
    "Content Search Strategies"
to the space-filled
    "C onte nt S e a rc h S tra te g ie s"
Searching the space-filled phrase found the file at Google, Yahoo!, and Live, but none of the search engines found the page using the grammatical phrase--which is the way the text was entered and displayed within PowerPoint.
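The mismatch is easy to reproduce. In this small Python sketch (the helper names are hypothetical, and real search engines match phrases in more elaborate ways), a plain phrase comparison fails against the space-filled transcript, while a comparison that ignores all whitespace shows the two strings carry the same letters.

```python
def contains_phrase(text, phrase):
    """Exact phrase match, ignoring case, roughly as a phrase search behaves."""
    return phrase.lower() in text.lower()


def squash(text):
    """Remove all whitespace so stray spaces no longer break a comparison."""
    return "".join(text.split()).lower()


original = "Content Search Strategies"
garbled = "C onte nt S e a rc h S tra te g ie s"

print(contains_phrase(garbled, original))   # False
print(squash(garbled) == squash(original))  # True
```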

THE FUTURE OF EMBEDDING

We are still early in the development of this sharing and embedding environment. Several of these sites have only been available for a year or two. Some will likely fail, while others will grow and expand. (Given their lack of guaranteed permanence, be sure to back up any important data on a regular basis.) One advantage to storing content in such sites is the ability to circumvent local systems constraints. Many of these sites are relatively easy to use, and we can hope that their usability continues to improve. Another advantage is that the content can be updated once, which will then update the content on all the sites where it is embedded.

Visit a few of these sites to get familiar with the way data is presented, and search for items with information value amidst the sometimes overwhelming self-published dross. Even more importantly, consider ways in which the capabilities of these sites can help to share, disseminate, and retrieve information in new and exciting ways.