Research

The literature about information retrieval, SQL, query languages and database technologies is extensive. Many documents are available on the Internet but finding useful information among billions of pages can be daunting. These selections include academic papers, white papers, specifications, briefings, conference presentations, and articles from journals, magazines and web sites: 

Data Access

Data Mining

Modeling
Information Retrieval
Query Optimization

SQL/XML

Web Farming

 


Database Systems: The First Generation discusses the emergence of database systems and the results of the CODASYL Systems Committee Database Systems survey of 1968.

The keynote presentation at Enterprise Data Forum 2003 covered SQL databases, vector databases, grid computing, web services, data mining, ubiquitous connectivity, and bio-electrical links.

DSQL: An SQL for Structured Documents
Arijit Sengupta, Mehmet Dalkilic (Indiana University)

The authors introduce DSQL, an SQL language for querying structured documents. DSQL is declarative and does not require users to navigate through data. It is based on a theoretical foundation of a declarative language (document calculus) and an equivalent procedural language (document algebra).

What First Normal Form Really Means is a two-part article by C. J. Date.

On Structured Types 
Chris Date discusses structured (user-defined, decomposable) types, tuples and scalar types.

Trees in SQL
Joe Celko discusses the adjacency list and nested set approaches to implementing hierarchies in SQL.
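
The two approaches named above can be compared with a small sketch. This is an illustration only, not code from Celko's article: the table names, column names and sample data are hypothetical, Python's built-in sqlite3 module stands in for a DBMS, and the recursive common table expression is just one way to walk an adjacency list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Adjacency list: each row points at its parent (hypothetical sample data).
cur.execute("CREATE TABLE emp_adj (name TEXT PRIMARY KEY, boss TEXT)")
cur.executemany(
    "INSERT INTO emp_adj VALUES (?, ?)",
    [("Albert", None), ("Bert", "Albert"), ("Chuck", "Albert"),
     ("Donna", "Chuck"), ("Eddie", "Chuck")],
)

# Nested sets: each row stores an interval (lft, rgt) that encloses
# the intervals of all of its descendants.
cur.execute(
    "CREATE TABLE emp_nested (name TEXT PRIMARY KEY, lft INTEGER, rgt INTEGER)"
)
cur.executemany(
    "INSERT INTO emp_nested VALUES (?, ?, ?)",
    [("Albert", 1, 10), ("Bert", 2, 3), ("Chuck", 4, 9),
     ("Donna", 5, 6), ("Eddie", 7, 8)],
)


def subordinates_adjacency(boss):
    """All descendants of boss, found by following parent pointers recursively."""
    rows = cur.execute(
        """
        WITH RECURSIVE sub(name) AS (
            SELECT name FROM emp_adj WHERE boss = ?
            UNION ALL
            SELECT e.name FROM emp_adj AS e JOIN sub AS s ON e.boss = s.name
        )
        SELECT name FROM sub ORDER BY name
        """,
        (boss,),
    ).fetchall()
    return [r[0] for r in rows]


def subordinates_nested(boss):
    """All descendants of boss, found with a single interval-containment join."""
    rows = cur.execute(
        """
        SELECT child.name
        FROM emp_nested AS parent
        JOIN emp_nested AS child
          ON child.lft > parent.lft AND child.rgt < parent.rgt
        WHERE parent.name = ?
        ORDER BY child.name
        """,
        (boss,),
    ).fetchall()
    return [r[0] for r in rows]


print(subordinates_adjacency("Albert"))
print(subordinates_nested("Albert"))
```

In this sketch the adjacency list is simple to update but needs recursion to read a whole subtree, while the nested set table answers subtree queries with one non-recursive join at the cost of renumbering intervals on insert.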

Similarity Searching and Domain Vocabularies
Ken North discusses information retrieval technology. Machine learning and domain expertise, individually or in combination, can improve search results.

Data Access

Service Data Objects
This specification from BEA and IBM defines objects for accessing data from heterogeneous data sources (relational databases, XML data stores, enterprise applications and services).

ODBC, JDBC und die Suche nach der Black Box
ODBC, JDBC and the Search for a Black Box

Modeling

Michael David

This paper matters to the SQL/XML industry because it demonstrates how standard ANSI SQL can perform full multi-leg hierarchical processing. It explains how the relational Cartesian processing engine inherently performs the Lowest Common Ancestor (LCA) logic that hierarchical processing requires.

More links for modeling

SQL/XML

Alternative Schemes for Mapping XML to Databases
Daniela Florescu, Donald Kossmann

Several schemes exist for mapping XML documents to tables. This paper compares the performance of a number of them.

Automatically Utilizing XML's Untapped Semantic Goldmine
Michael David 

The author explores the benefit of exploiting XML's hierarchical semantics by modeling hierarchical structures in SQL. He explains that the Left Outer Join can preserve data hierarchies and SQL can model network type (multi-path) structures.
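
The claim that a Left Outer Join preserves a hierarchy can be illustrated with a minimal sketch. It is not taken from the paper: the dept and emp tables and their rows are hypothetical, and Python's sqlite3 module is used only as a convenient SQL engine. An inner join drops a parent row that has no children, while the left outer join keeps it with NULL in the child columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE emp  (emp_id INTEGER PRIMARY KEY, dept_id INTEGER, emp_name TEXT);
    INSERT INTO dept VALUES (1, 'Research'), (2, 'Sales');
    -- 'Sales' deliberately has no employees.
    INSERT INTO emp  VALUES (10, 1, 'Ada'), (11, 1, 'Grace');
    """
)

# Inner join: a parent with no children disappears from the result.
inner = conn.execute(
    """
    SELECT d.dept_name, e.emp_name
    FROM dept AS d
    JOIN emp AS e ON e.dept_id = d.dept_id
    ORDER BY d.dept_id, e.emp_id
    """
).fetchall()

# Left outer join: every parent survives, so the hierarchy stays intact.
outer = conn.execute(
    """
    SELECT d.dept_name, e.emp_name
    FROM dept AS d
    LEFT OUTER JOIN emp AS e ON e.dept_id = d.dept_id
    ORDER BY d.dept_id, e.emp_id
    """
).fetchall()

print(inner)
print(outer)
```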

Clio: Mapping XML and Relational Schemas
Lucian Popa, Mauricio A. Hernández, Yannis Velegrakis, Renée J. Miller, Felix Naumann, Howard Ho

An IBM research project developed Clio to map between relational data and XML schemas. Clio first does a semantic translation to generate a logical mapping. Then it translates the logical mapping into an SQL query or XQuery.

Efficient XML-to-SQL Query Translation: Where to Add the Intelligence?
Rajasekar Krishnamurthy (IBM Almaden Research Center), Raghav Kaushik (Microsoft Research), Jeffrey F. Naughton (University of Wisconsin-Madison)

The authors present a translation algorithm for generating efficient SQL for path expressions over tree schemas.

SQL:2003 Has Been Published
Andrew Eisenberg, Krishna Kulkarni, Jan-Eike Michels (IBM), Fred Zemke, Jim Melton (Oracle)

This document explains the changes to the SQL standard, including Part 14 (SQL/XML). Part 2 (Foundation) includes new data types, a new MERGE statement, a sequence generator object, and enhancements to CREATE TABLE.

Standard SQL
Peter Gulutzan

This document analyzes which features of the SQL:1999 and SQL:2003 standards are supported by major DBMS vendors.

Optimization Techniques for Mapping Documents to Databases
Holger Meyer, Meike Klettke (University of Rostock)

The authors discuss the statistical analysis of documents to optimize the storage of documents in object-relational databases.

Query Optimization

Issues in Real-World Query Optimization and Processing
Håkan Jakobsson (Oracle)

Notes from a guest lecture at Stanford University.

More links for query optimization

Data Mining, Web Farming

Iko Pramudiono, Takahiko Shintani, Takayuki Tamura, Masaru Kitsuregawa

Abstract: Data mining is becoming increasingly important since the size of databases grows even larger and the need to explore hidden rules from the databases becomes widely recognized. Currently database systems are dominated by relational database and the ability to perform data mining using standard SQL queries will definitely ease implementation of data mining. However the performance of SQL based data mining is known to fall behind specialized implementation. In this paper we present an evaluation of parallel SQL based data mining on large scale PC cluster.

More links for data mining and web farming

Information Retrieval

Hierarchical Structures and 4GLs
Michael M. David (Advanced Data Access Technologies)

This white paper discusses the processing of hierarchical data structures. It covers entity relationships, recursive structures, hierarchical semantics and the use of SQL for processing queries over hierarchical data. It also addresses the application of Lowest Common Ancestor logic, XML joinless access and the mapping of network structures to hierarchical structures.
 

A Comparison of String Metrics for Matching Names and Records
William W. Cohen, Pradeep Ravikumar, Stephen E. Fienberg (Carnegie Mellon University)

Abstract: We describe an open-source Java toolkit of methods for matching names and records. We summarize results obtained from using various string distance metrics on the task of matching entity names. These metrics include distance functions proposed by several different communities, such as edit-distance metrics, fast heuristic string comparators, token-based distance metrics, and hybrid methods.
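
Two of the metric families mentioned in the abstract can be sketched briefly. The code below is not the toolkit's API (the toolkit is written in Java); it is a plain Python illustration, with hypothetical function names, of one edit-distance metric (Levenshtein) and one token-based metric (Jaccard similarity over whitespace-separated tokens).

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def jaccard(a: str, b: str) -> float:
    """Token-based similarity: shared tokens divided by all distinct tokens."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty strings are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


print(levenshtein("kitten", "sitting"))
print(jaccard("William W. Cohen", "Cohen William"))
```

The edit-distance function is sensitive to character-level typos, while the token-based function ignores word order, which is why the two families behave differently on reordered names.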
 


© 2004-5,  Ken North Computing, LLC. All rights reserved.
This page was last updated on 22-Aug-2007.