Please tell us about yourself. How did you end up in such an offbeat, unconventional and unusual career?
Aziz Eram, a master’s student at the University of Arkansas at Little Rock studying information quality, is one of six graduate students who participated in the South Big Data Hub’s DataStart Program. DataStart provides funding that allows talented graduate students to work as student fellows with startups that need data science talent. She served her summer fellowship with Black Oak Analytics in Little Rock. Below are her thoughts about the program.
Acxiom Corporation announced that Aziz Eram was awarded an Acxiom Diversity Scholarship in the amount of $5,000. The Acxiom Diversity Scholarship Program was designed by Acxiom Corporation to create an inclusive and supportive workplace environment. Eram’s award was one of 10 Diversity Scholarships given by Acxiom in 2016.
Eram joined the UALR (University of Arkansas at Little Rock) MSIQ (Master of Science in Information Quality) program in fall 2015 after completing her Master of Science in Applied Mathematics at Osmania University in India. She also completed her undergraduate degree in Statistics and Computer Science at Osmania University in 2012. Eram is also a graduate research assistant on a project sponsored by Black Oak Analytics, Inc., a Little Rock-based information quality company.
Tell us about the scholarship
David Knowles, Director of Economic Development and Engagement at the University of North Carolina at Chapel Hill, announced on May 10th that Aziz Eram, a student in the University of Arkansas at Little Rock (UALR) Master of Science in Information Quality program, has been awarded an internship through the Southern Startup Internship Program in Data Science, known as “DataStart.” The DataStart program was created as a way to give graduate students from the 16 states that comprise the South Big Data Regional Innovation Hub (South BD Hub) the opportunity to work with data-related startup companies on data-intensive business challenges important to the company. Eram’s startup sponsor is Black Oak Analytics, Inc., in Little Rock, Arkansas. Black Oak was incubated out of the UALR Information Quality Graduate Program and specializes in high-performance, Big Data integration solutions. Students in the DataStart program are paid for their work through a grant to the startup company. The complete list of DataStart interns and their sponsoring companies can be found here.
Tell us about your internship
I had the opportunity to intern at Black Oak Analytics, a Little Rock data startup, through a DataStart Fellowship managed by the South Big Data Hub. I did not come to the program with any industry knowledge, but I have a bachelor’s degree in computer science and statistics and also a master’s in applied mathematics. I was excited to be hired as an intern at Black Oak and to say that I have learned a lot in my internship is an understatement. I have grown tremendously, learning foundational data mining and data-driven marketing skills. Black Oak Analytics is a company that provides advanced solutions that allow organizations of any size to convert data into recommendations and actions designed to improve profitability, competitiveness, and customer satisfaction.
What does the company actually do?
Once you gather large amounts of data about your customers and prospects, the quality controls around that data often remain a low priority. The effectiveness and success of any solution is directly tied to the quality and organization of the data it is based on. Poor data quality can be costly and damage a company’s reputation. By assessing the full lifecycle of an organization’s data, from initial source acquisition through internal and external systems, Black Oak Analytics can identify areas in which improvements can be made to the quality and treatment of data. Black Oak uses software called the High Performance Entity Resolution System (HiPER), an entity identity information management system that supports the full lifecycle of entity identity information. Black Oak also offers a rock-solid data governance plan to help customers make sense of their most valuable asset.
Black Oak’s mission is to become their clients’ trusted partner by helping them manage information as a corporate asset and use it as a competitive differentiator. The talent that surrounded me at Black Oak was fantastic and I am very fortunate to have worked for a company that values collaboration, creativity, and culture. This internship gave me the opportunity to get my foot in the door while building on my education, helped me develop professionally, and fueled my confidence.
My internship mainly focused on data integration of unstructured entity references. The primary goal of my work was to develop and test a more general approach to the problem of resolving entity references in free text format. To do this I have been using HiPER, which runs as a stand-alone entity identity management service and mainly focuses on increasing both reliability and matching of data. HiPER has a plugin interface for building custom comparators in addition to a wide array of built-in, industry standard comparator functions.
What was your role?
Many industries and companies have data that exists in free text format, such as merchant/transaction descriptions on credit card statements, retail inventory details, medical and pharmacy records, etc. I was provided with two main data sets:
- Lender name data sourced from public record information. This kind of data is mostly used by third party data compilers to create hotline marketing files of new homeowners and new borrowers.
- Credit card transaction data from one of the top three credit card issuers in the country.
My task was to design and implement two new comparators called Business Parser and MAC (Multi-Valued Attribute Comparator). The Business Parser Comparator helps to match different unstructured data values to a single structured data identifier. For example: “FREEDOM MTG CORP,” “FREEDOM MOBILE HM SALES INC” and “FREEDOM MTG CONSULTANTS INC” were matched to a single identity, “FREEDOM MORTGAGE CORPORATION.” While Business Parser generates a matching link based on only one identity attribute, MAC generates a matching link based on more than one attribute.
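As a rough illustration of the kind of matching described above, here is a minimal Python sketch. It is not the Business Parser or MAC implementation and does not use HiPER; the abbreviation table, stopword list, threshold, function names, the second canonical name, and the "city" attribute are all assumptions made for the example. It expands common abbreviations, drops generic legal-form words, and links a free-text name to a canonical identity by token overlap (Jaccard similarity). A second function averages that score over several attributes, loosely mirroring the single-attribute versus multi-attribute distinction.

```python
import re

# Hypothetical abbreviation table; a real system would need a far larger one.
ABBREVIATIONS = {
    "MTG": "MORTGAGE",
    "CORP": "CORPORATION",
    "INC": "INCORPORATED",
    "HM": "HOME",
}

# Generic legal-form words that say little about which business is meant.
STOPWORDS = {"CORPORATION", "INCORPORATED", "LLC", "CO", "COMPANY"}


def key_tokens(text):
    """Upper-case, split on non-alphanumerics, expand abbreviations, drop stopwords."""
    tokens = re.findall(r"[A-Z0-9]+", text.upper())
    expanded = (ABBREVIATIONS.get(t, t) for t in tokens)
    return {t for t in expanded if t not in STOPWORDS}


def similarity(a, b):
    """Jaccard similarity of the key tokens of two strings (0.0 to 1.0)."""
    ta, tb = key_tokens(a), key_tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def resolve(reference, canonical_names, threshold=0.5):
    """Single-attribute match: best canonical name at or above threshold, else None."""
    if not canonical_names:
        return None
    best = max(canonical_names, key=lambda c: similarity(reference, c))
    return best if similarity(reference, best) >= threshold else None


def multi_attribute_score(rec_a, rec_b, attributes):
    """Multi-attribute match: average similarity over the listed attributes."""
    scores = [similarity(rec_a[attr], rec_b[attr]) for attr in attributes]
    return sum(scores) / len(scores)


# The first canonical name comes from the example above; the second is made up.
CANONICAL = ["FREEDOM MORTGAGE CORPORATION", "FIRST NATIONAL BANK"]

for raw in ["FREEDOM MTG CORP", "FREEDOM MTG CONSULTANTS INC"]:
    print(raw, "->", resolve(raw, CANONICAL))

# Hypothetical records with an extra "city" attribute, for illustration only.
a = {"name": "FREEDOM MTG CORP", "city": "Little Rock"}
b = {"name": "FREEDOM MORTGAGE CORPORATION", "city": "LITTLE ROCK"}
print(multi_attribute_score(a, b, ["name", "city"]))
```

Under these assumptions, plain token overlap links the two “MTG” variants but would not link “FREEDOM MOBILE HM SALES INC,” which shares only one token with the canonical name. A production comparator would therefore need rules beyond this sketch.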
How will these comparators be useful?
Once the data is in a structured format, it is useful to organizations in many ways. For example, it can be used to generate more accurate reports, which in turn can result in improved inventory management, elimination of inconsistent pricing, improved sales, and improved operational efficiencies. A majority of my work at Black Oak Analytics dealt with entity resolution practices.
How did you benefit from this internship?
I have learned many different skills during my internship, including data mining, data matching, and data linking, and these will all help me to build my career in data science. I want to thank my supervisors, Steve Sample and Dr. John Talburt, the HiPER team, and all the members of Black Oak for supporting and guiding me. Without their help, I would not have been able to complete the project. Last but not least, I am extremely thankful to the DataStart program for giving me this wonderful opportunity to work with these amazing people.
In an interview, Eram said, “I am honored and grateful to have been selected as a recipient of the Acxiom Diversity Scholarship. This award will not only allow me to continue my studies at UALR, but it also reaffirms my faith that hard work does get recognized and appreciated, which is indeed a great feeling. This is a big development in my life as it gives me a morale boost and the self-confidence to excel and continue to pursue academic excellence through my current studies and beyond. I would also like to extend special thanks to Dr. Talburt for guiding me since I started in the IQ program.”