Getting Started with the LARC Dataset

Overview

LARC (Learning Analytics Data Architecture) is a research-focused data set containing information about students who have attended the University of Michigan since 1996. The data can help answer typical learning analytics questions about students, their academic careers, and their class outcomes.

You can access LARC from SQL clients (e.g., SQL Developer), statistical and data analysis software (e.g., Stata, R), visualization software (e.g., Tableau), or any other tool capable of connecting to an Oracle database. ITS recommends SQL Developer as a SQL client and Tableau as a visualization tool.

Note: Your access to LARC comes with or without Personal Identifying Information (PII), depending on your business needs. The PII table includes a student's U-M ID number (UMID), U-M uniqname, date of birth, and more.

Table of Contents

This document includes links to LARC resources and describes three ways to connect to the LARC data set: via SQL Developer, via a different tool using JDBC, and via Google Cloud Platform (GCP).

Connect to LARC Data Set via SQL Developer

Download SQL Developer

Note: You will need to have administrative rights on your computer. If you do not, contact your Neighborhood IT support person to request the SQL Developer software.

  1. Verify that the Oracle client software is installed on your computer. If it is not, contact your unit IT or download Oracle's Instant Client.
  2. Connect your computer to the campus network with an ethernet cable, or connect wirelessly through a campus VPN. For more information, see the MiWorkspace Work Remotely page.
  3. Navigate to SQL Developer Downloads on the Oracle webpage.
  4. Select the correct platform and click the Download link. Windows users should select the link for Windows 64-bit with JDK 8 included. You will be prompted to sign in or create a free Oracle account.
  5. Follow the instructions in the prompts to install Oracle SQL Developer.

Create a new database connection

The next step is to connect to the data.

  1. Click the green plus sign under Connections. The New/Select Database Connection window will then open.
  2. Enter a name for the connection in the Name field.
  3. Enter your username in the Username field.
  4. Enter your password for the database in the Password field.

Note: You received an email inviting you to collaborate in a private U-M Box folder titled with your uniqname. The folder contains your ID and password to the U-M Enterprise Data Warehouse Production (EDWPROD) environment.

  5. Select LDAP from the Connection Type drop-down list.
  6. The LDAP Server field should populate. Select the first option from the drop-down list:

oidprd1-ora.dsc.umich.edu:3060:3131

Note: If no values display, enter oidprd1-ora.dsc.umich.edu:3060:3131 and press Tab.

  7. Select cn=OracleContext,dc=dsc,dc=umich,dc=edu from the Context drop-down list.

Note: If no values display, click Load.

  8. Double-click the appropriate database (e.g., edwprod) from the DB Service drop-down list.

Note: If no values display, click Load.

  9. Click Connect. You will be prompted to enter your password.
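
For tools that take a single connection URL instead of separate dialog fields, the LDAP Server, Context, and DB Service values entered above can be combined into the LDAP URL form that Oracle's thin JDBC driver documents. The sketch below is illustrative only: the function name ldap_jdbc_url is hypothetical, it assumes the LDAP Server value is laid out as host:port:sslport and uses the first port, and this document does not confirm that EDWPROD accepts connections in this URL form.

```python
def ldap_jdbc_url(ldap_server: str, context: str, db_service: str) -> str:
    """Compose an Oracle thin-driver LDAP URL from the three dialog values."""
    # Assumption: the LDAP Server value is "host:port:sslport".
    # Only the first (non-SSL) port is used in this sketch.
    host, port, _ssl_port = ldap_server.split(":")
    return f"jdbc:oracle:thin:@ldap://{host}:{port}/{db_service},{context}"


url = ldap_jdbc_url(
    "oidprd1-ora.dsc.umich.edu:3060:3131",      # LDAP Server
    "cn=OracleContext,dc=dsc,dc=umich,dc=edu",  # Context
    "edwprod",                                  # DB Service
)
print(url)
```

If the tool you are using rejects this form, the full JDBC connection string given later in this document is the supported alternative.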

Create a query

In the example above, we connected to the EDWPROD database. The LARC data set is CNLYR002. In EDWPROD, navigate to Other Users - CNLYR002 - Tables or Other Users - CNLYR002 - Views to see the LARC tables and views to which you have been granted access.

The next step is to create a query using the Query Builder. Below is an example:

SELECT COUNT(DISTINCT stdnt_id)
FROM CNLYR002.curr_stdnt_info
WHERE stdnt_asian_ind = 1;

This query asks the database to return a distinct count (i.e., no duplicates) of student IDs from the CURR_STDNT_INFO view (which contains only the most current data) of CNLYR002 (the LARC data set), limited to rows where the Asian indicator equals 1.
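
To see how the distinct count in the example query behaves without connecting to EDWPROD, the same statement can be run against a small stand-in table. The sketch below uses Python's built-in sqlite3 module with made-up rows; the table reuses the view and column names from the example, drops the CNLYR002 prefix (SQLite has no such schema), and deliberately repeats one student ID to show the effect of DISTINCT. It does not reflect real LARC data.

```python
import sqlite3

# In-memory stand-in for the CURR_STDNT_INFO view (made-up rows, not LARC data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE curr_stdnt_info (stdnt_id INTEGER, stdnt_asian_ind INTEGER)"
)
rows = [(101, 1), (101, 1), (102, 0), (103, 1)]  # student 101 appears twice
conn.executemany("INSERT INTO curr_stdnt_info VALUES (?, ?)", rows)

# Same shape as the example query: each matching student ID is counted once.
(distinct_count,) = conn.execute(
    "SELECT COUNT(DISTINCT stdnt_id) FROM curr_stdnt_info WHERE stdnt_asian_ind = 1"
).fetchone()

# Without DISTINCT, every matching row is counted.
(row_count,) = conn.execute(
    "SELECT COUNT(*) FROM curr_stdnt_info WHERE stdnt_asian_ind = 1"
).fetchone()

print(distinct_count, row_count)  # prints: 2 3
conn.close()
```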

Connect to LARC Data Set via a Different Tool using JDBC

If you are connecting to the LARC data set in the Enterprise Data Warehouse from a tool other than SQL Developer, you can use a Java Database Connectivity (JDBC) connection string:

jdbc:oracle:thin:@(DESCRIPTION=
 (SOURCE_ROUTE=yes)
  (ADDRESS_LIST=
   (FAILOVER=on)
   (LOAD_BALANCE=on)
   (ADDRESS=(PROTOCOL=tcp)(HOST=oraconnp1.dsc.umich.edu)(PORT=30421))
   (ADDRESS=(PROTOCOL=tcp)(HOST=oraconnp2.dsc.umich.edu)(PORT=30421)))
  (ADDRESS=(PROTOCOL=tcp)(HOST=edwprod-db.dsc.umich.edu)(PORT=1521))
  (CONNECT_DATA=(SERVICE_NAME=rmt_edwprod)))

After connecting to the EDWPROD database, select CNLYR002 to see the LARC tables and views to which you have been granted access.

Notes:

  • In some applications, “CNLYR002” may be referred to as the schema name or service name.
  • If you are connecting from a tool that does not support making a connection through JDBC, please contact the ITS Service Center for technical support.
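
Some tools accept the JDBC connection string only in a single-line field, and a parenthesis lost while copying is an easy mistake to make. As a minimal sketch, the Python below collapses the multi-line string shown above into one line and checks that its parentheses still balance. The helper names flatten_jdbc_url and parentheses_balanced are hypothetical, and the check says nothing about whether the hosts are reachable or your credentials are valid.

```python
import re

JDBC_PREFIX = "jdbc:oracle:thin:@"

MULTILINE_URL = """jdbc:oracle:thin:@(DESCRIPTION=
 (SOURCE_ROUTE=yes)
  (ADDRESS_LIST=
   (FAILOVER=on)
   (LOAD_BALANCE=on)
   (ADDRESS=(PROTOCOL=tcp)(HOST=oraconnp1.dsc.umich.edu)(PORT=30421))
   (ADDRESS=(PROTOCOL=tcp)(HOST=oraconnp2.dsc.umich.edu)(port=30421)))
  (ADDRESS=(PROTOCOL=tcp)(HOST=edwprod-db.dsc.umich.edu)(PORT=1521))
  (CONNECT_DATA=(SERVICE_NAME=rmt_edwprod)))"""


def flatten_jdbc_url(multiline_url: str) -> str:
    """Remove all whitespace so the URL fits a single-line field."""
    return re.sub(r"\s+", "", multiline_url)


def parentheses_balanced(text: str) -> bool:
    """Return True if every '(' has a matching ')' in the right order."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its '('
                return False
    return depth == 0


url = flatten_jdbc_url(MULTILINE_URL)
if not (url.startswith(JDBC_PREFIX) and parentheses_balanced(url)):
    raise ValueError("connection string looks truncated or malformed")
print(url)
```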

Connect to LARC Data Set via Google Cloud Platform (GCP)

Users who prefer to access LARC data as comma-separated values (.CSV) files can download the .ZIP files from the LARC Student folder, a read-only, shared folder on Google Cloud Platform used by everyone with access to the LARC data set. In addition to release notes and a link to the LARC data dictionary, the folder includes the Zipped Data files: LARC Student Info, LARC Student Term Info, LARC Student Term Class Info, and LARC Student Term Transfer Info. You can download the files and work with them in Excel or another tool.

The LARC flat files do not contain any Personal Identifying Information (PII). PII data is available only in the database.

  1. Navigate to the LARC Student folder on Google Cloud Platform using the link provided in the LARC documentation and log in.
  2. Check the box next to the file you want and select Download.
    Note: Individual data files represent a single snapshot. For example, LARC_20200527_STDNT_INFO.csv.bz2 is basic student info (STDNT_INFO) from the May 27, 2020 snapshot (20200527).
  3. Navigate to the folder on your computer where you saved the file and extract (i.e., unzip) it.
  4. Open the .CSV file in Excel or another tool.
    Note: You may only see a partial data set if you use Excel, TextPad, or other basic tools that cannot display the full volume of data.
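
When a downloaded file is too large for Excel or a text editor, it can be read one row at a time in a script instead. The sketch below uses only the Python standard library to parse the snapshot date and table name from a file name such as LARC_20200527_STDNT_INFO.csv.bz2, and to count the data rows in a bzip2-compressed CSV without extracting it first. The function names are hypothetical, and the UTF-8 encoding and the presence of a header row are assumptions not stated in this document.

```python
import bz2
import csv
import re
from datetime import date, datetime
from typing import Tuple

# File name convention from the example: LARC_<YYYYMMDD>_<TABLE>.csv.bz2
NAME_PATTERN = re.compile(r"^LARC_(\d{8})_(.+)\.csv\.bz2$")


def parse_snapshot_name(filename: str) -> Tuple[date, str]:
    """Return (snapshot date, table name) parsed from a LARC file name."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    snapshot = datetime.strptime(match.group(1), "%Y%m%d").date()
    return snapshot, match.group(2)


def count_data_rows(path: str) -> int:
    """Stream a bzip2-compressed CSV and count the rows after the header."""
    # Assumptions: the file is UTF-8 text and its first row is a header.
    with bz2.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the assumed header row
        return sum(1 for _ in reader)


snapshot, table = parse_snapshot_name("LARC_20200527_STDNT_INFO.csv.bz2")
print(snapshot.isoformat(), table)  # prints: 2020-05-27 STDNT_INFO
```

Calling count_data_rows with the path to a downloaded file reads it in a streaming fashion, so memory use stays small regardless of file size.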

Resources

Last Updated: Wednesday, February 24, 2021