University of Limerick Institutional Repository

CoDS: A representative sampling method for relational databases

dc.contributor.author Buda, Teodora Sandra
dc.contributor.author Cerqueus, Thomas
dc.contributor.author Murphy, John
dc.contributor.author Kristiansen, Morten
dc.date.accessioned 2014-08-05T08:58:17Z
dc.date.available 2014-08-05T08:58:17Z
dc.date.issued 2013
dc.identifier.uri http://hdl.handle.net/10344/3932
dc.description peer-reviewed en_US
dc.description.abstract Database sampling has become a popular approach to handle large amounts of data in a wide range of application areas such as data mining or approximate query evaluation. Using database samples is a potential solution when using the entire database is not cost-effective, and a balance between the accuracy of the results and the computational cost of the process applied on the large data set is preferred. Existing sampling approaches are either limited to specific application areas, to single table databases, or to random sampling. In this paper, we propose CoDS: a novel sampling approach targeting relational databases that ensures that the sample database follows the same distribution for specific fields as the original database. In particular it aims to maintain the distribution between tables. We evaluate the performance of our algorithm by measuring the representativeness of the sample with respect to the original database. We compare our approach with two existing solutions, and we show that our method performs faster and produces better results in terms of representativeness. en_US
dc.language.iso eng en_US
dc.publisher Springer en_US
dc.relation.ispartofseries 24th International Conference on Database and Expert Systems Applications (DEXA 2013) [ Lecture Notes in Computer Science];8055, pp. 342-356
dc.relation.uri http://dx.doi.org/10.1007/978-3-642-40285-2_30
dc.rights The original publication is available at www.springerlink.com en_US
dc.subject relational database en_US
dc.subject representative database sampling en_US
dc.title CoDS: A representative sampling method for relational databases en_US
dc.type info:eu-repo/semantics/conferenceObject en_US
dc.type.supercollection all_ul_research en_US
dc.type.supercollection ul_published_reviewed en_US
dc.contributor.sponsor SFI en_US
dc.relation.projectid 10/CE/I1855 en_US
dc.rights.accessrights info:eu-repo/semantics/openAccess en_US

