University of Limerick Institutional Repository

Text mining StackOverflow: an insight into challenges and subject-related difficulties faced by computer science learners

dc.contributor.author Joorabchi, Arash
dc.contributor.author English, Michael
dc.contributor.author Mahdi, Abdulhussain E.
dc.date.accessioned 2016-07-05T11:05:20Z
dc.date.available 2016-07-05T11:05:20Z
dc.date.issued 2016
dc.identifier.citation Joorabchi, A; English, M; Mahdi, AE (2016) 'Text mining StackOverflow: an Insight into Challenges and Subject-Related Difficulties Faced by Computer Science Learners'. Journal of Enterprise Information Management, 29 (2):255-275. en_US
dc.identifier.uri http://hdl.handle.net/10344/5099
dc.description peer-reviewed en_US
dc.description.abstract Purpose: The use of social media, and in particular community Q&A websites, by learners has increased significantly in recent years. The vast amounts of data posted on these sites provide an opportunity to investigate the topics under discussion and those receiving most attention. The purpose of this article is to automatically analyse the content of a popular computer programming Q&A website, StackOverflow, determine the exact topics of posted Q&As, and narrow down their categories to help determine subject difficulties of learners. By doing so, we have been able to rank identified topics and categories according to their frequencies and, therefore, mark the most asked about subjects and, hence, identify the most difficult and challenging topics commonly faced by learners of computer programming and software development. Design/methodology/approach: In this work we have adopted a heuristic research approach combined with a text mining approach to investigate the topics and categories of Q&A posts on the StackOverflow website. Almost 160,000 Q&A posts were analysed and their categories refined using Wikipedia as a crowd-sourced classification system. After identifying and counting the occurrence frequency of all the topics and categories, their semantic relationships are established. This data is then presented as a rich graph which could be visualized using graph visualization software such as Gephi. Findings: The reported results and corresponding discussion indicate that the insight gained from the process can be further refined and potentially used by instructors, teachers and educators to pay more attention to and focus on the commonly occurring topics/subjects when designing their course material, delivery and teaching methods. Research limitations/implications: The proposed approach limits the scope of the analysis to a subset of Q&As which contain one or more links to Wikipedia. Therefore, developing more sophisticated text mining methods capable of analysing a larger portion of available data would improve the accuracy and generalizability of the results. Originality/value: The application of text mining and data analytics technologies in education has created a new interdisciplinary field of research between the education and information sciences, called Educational Data Mining (EDM). The work presented in this article falls under this field of research, and it is an early attempt at investigating the practical applications of text mining technologies in the area of computer science education. en_US
dc.language.iso eng en_US
dc.publisher Emerald en_US
dc.relation.ispartofseries Journal of Enterprise Information Management;29 (2), pp. 255-275
dc.relation.uri http://www.emeraldinsight.com/doi/full/10.1108/JEIM-11-2014-0109
dc.rights This article is (c) Emerald Group Publishing and permission has been granted for this version to appear here http://ulir.ul.ie. Emerald does not grant permission for this article to be further copied/distributed or hosted elsewhere without the express permission from Emerald en_US
dc.subject education data mining en_US
dc.subject technology supported learning en_US
dc.subject social learning en_US
dc.subject text mining en_US
dc.subject course design and delivery en_US
dc.title Text mining StackOverflow: an insight into challenges and subject-related difficulties faced by computer science learners en_US
dc.type info:eu-repo/semantics/article en_US
dc.type.supercollection all_ul_research en_US
dc.type.supercollection ul_published_reviewed en_US
dc.date.updated 2016-06-23T13:36:16Z
dc.description.version ACCEPTED
dc.identifier.doi 10.1108/JEIM-11-2014-0109
dc.rights.accessrights info:eu-repo/semantics/openAccess en_US
dc.internal.rssid 1597873
dc.internal.copyrightchecked Yes
dc.description.status peer-reviewed

