Jingfeng Li

Seeking a position as a Data Scientist or Research Scientist

I hold a PhD in computational neuroscience and am currently a Data Incubator fellow. I developed predictive behavioral models in my PhD work. At the Data Incubator, I built a model that effectively predicts loan outcomes. I have also built my own data processing pipeline and have extensive experience with MATLAB, Python, and SQL (see Resume). I am looking for a full-time Data Scientist position.


GitHub Projects

Lending Club Cracked (An Ensemble Model)

On average, 18% of Lending Club loans default. I developed an ensemble model (combining gradient-boosted decision trees, naive Bayes, and logistic regression) to help people spot those faulty loans, reducing investment risk by 50%.
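As a rough illustration of how an ensemble like this can combine its members, the sketch below averages the default probabilities predicted by each model (soft voting) and flags loans above a cutoff. The averaging rule, the 0.5 threshold, the function names, and the sample probabilities are all assumptions made for illustration; the project itself may combine its models differently.

```python
def average_default_probability(model_probabilities):
    """Average the per-loan default probabilities across models (soft voting)."""
    n_models = len(model_probabilities)
    n_loans = len(model_probabilities[0])
    return [
        sum(probs[i] for probs in model_probabilities) / n_models
        for i in range(n_loans)
    ]


def flag_risky_loans(model_probabilities, threshold=0.5):
    """Flag a loan as risky when its averaged default probability reaches the threshold."""
    averaged = average_default_probability(model_probabilities)
    return [p >= threshold for p in averaged]


# Hypothetical predicted default probabilities for three loans, one list per model.
gbdt = [0.10, 0.70, 0.40]
naive_bayes = [0.20, 0.90, 0.60]
logistic = [0.30, 0.80, 0.20]

averaged = average_default_probability([gbdt, naive_bayes, logistic])
flags = flag_risky_loans([gbdt, naive_bayes, logistic])
```

With these made-up inputs, only the second loan is flagged, because it is the only one whose averaged probability exceeds the cutoff.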

Digit Recognition Neural Network

I initially developed this network as a graduate school class project. It is a multilayer perceptron for recognizing handwritten digits, trained with the back-propagation algorithm. I used a popular dataset ('mnist_all.dat'), provided in the Midterm folder, comprising training and testing samples of the different digits. Each sample is a 28x28 8-bit grayscale image.

Sales Number Forecast based on Random Forest

I built a random forest model to predict the monthly unadjusted Canadian retail sales number. The prediction explains 98.66% of the variance in the actual, historical data. In almost every case (probability of 0.95), the error in the prediction (the absolute difference between the actual data and the prediction) is less than ~5% of the average sales number.
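The two figures quoted above can be computed along the following lines. This is a minimal sketch that assumes "variance explained" means the coefficient of determination (R squared) and that the 0.95 figure is a nearest-rank quantile of the absolute error divided by the mean of the actual values; the function names and the sample numbers are hypothetical and are not taken from the project.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def relative_error_quantile(actual, predicted, q=0.95):
    """Nearest-rank q-quantile of |actual - predicted| divided by the mean actual value."""
    mean_actual = sum(actual) / len(actual)
    relative_errors = sorted(
        abs(a - p) / mean_actual for a, p in zip(actual, predicted)
    )
    rank = max(1, math.ceil(q * len(relative_errors)))
    return relative_errors[rank - 1]


# Hypothetical monthly sales figures and predictions (illustrative only).
actual_sales = [100.0, 110.0, 120.0, 130.0]
predicted_sales = [102.0, 108.0, 121.0, 129.0]

r2 = r_squared(actual_sales, predicted_sales)
q95 = relative_error_quantile(actual_sales, predicted_sales)
```

Here `r2` is the fraction of variance explained, and `q95` is the relative error bound that about 95% of the predictions stay within.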

Regression Based Dependency Analysis

MATLAB code for regression-based dependency analysis

The goal of this analysis is to assess whether one process drives the correlation between two other signals. Ideally, one should extract just the shared (correlated) component of the signals, and then use causality analysis (e.g., Granger causality analysis or transfer entropy) to determine whether the process of interest drives that correlated component. Methods for extracting the correlated component require a large number of correlated signals as input to operate properly. With only a small number of correlated signals, the extracted component will necessarily include a substantial amount of non-correlated signal, regardless of the precise method used. Regression-based dependency analysis circumvents the need to extract only the correlated component.

Results are published in 'Li J, Bentley W, Snyder L. Neural Correlates of Functional Connectivity. Organization for Human Brain Mapping, 2015.'

FCMRI model

A Python implementation of the model described in Honey et al. (2007, PNAS), extended to capture both ultra-slow (minutes) and ultra-fast (millisecond) neural dynamics.

Publications & Talks

Publications


Li JM, Bentley WJ, Snyder AZ, Raichle M, Snyder LH (2015) Functional connectivity arises from a slow rhythmic mechanism, Proc Natl Acad Sci U S A. In press.

Bentley WJ*, Li JM*, Snyder AZ, Raichle M, Snyder LH (2014) Oxygen level and LFP in task positive and task negative areas: Bridging BOLD fMRI and electrophysiology. Cerebral Cortex: In press. (* Equal contribution)

Kubanek J, Li JM, Snyder LH (2015) Motor role of parietal cortex in a monkey model of hemi-spatial neglect, Proc Natl Acad Sci U S A. In press.

Li J, Bentley W, Snyder L. Functional Connectivity in Unit Activity. Organization for Human Brain Mapping, 2015.

Li J, Bentley W, Snyder L. Neural Correlates of Functional Connectivity. Organization for Human Brain Mapping, 2015.

Li J, Bentley W, Snyder L. Neural activity underlying functional connectivity MRI. Society for Neuroscience, 481.01. 2014

Bentley WJ, Li J, Snyder A, Raichle M, Snyder L. Functional connectivity arises from a slow rhythmic mechanism. Society for Neuroscience, 481.03. 2014

Bentley W, Li J, Snyder A, Raichle M, Snyder L. Bridging BOLD fMRI and Electrophysiology: Oxygen polarography in awake macaques. Organization for Human Brain Mapping, 2013.

Bentley W, Li J, Snyder A, Raichle M, Snyder L. Oxygen polarography and electrophysiology in the default-mode and dorsal-attention networks during rest and stimulation: Bridging BOLD fMRI and electrophysiology. Society for Neuroscience, 586.05. 2013

Selected Talks


Neural activity underlying functional connectivity MRI 2014
Society for Neuroscience Conference
Washington, DC
Neural mechanisms of functional connectivity MR imaging 2014
Anatomy and Neurobiology Department Trainee Seminar
St Louis, MO
Oxygen level and local field potential in a default mode area: Bridging BOLD fMRI and electrophysiology 2013
Society for Neuroscience Conference
San Diego, CA
Local Oxygen Fluctuations in the Brain 2012
Cognitive, Computational and Systems Neuroscience Symposium
St Louis, MO

Jingfeng Li

Education


Ph.D. Neuroscience (Computational) Washington University in St. Louis 2015
B.S. Biotechnology Peking University, China 2010


Work Experience


Data Science Fellow, the Data Incubator

2015-present

•  Selected from 1800+ applicants to participate in an intensive two-month data science
    fellowship program
•  Built an ensemble model to predict loan status using 500,000 LendingClub records,
    decreasing the chance of funding a failing loan by 50% (LINK)



Consultant, the BALSA group

2014-2015

Evaluated a startup company (Mobius ABX) for BioGenerator, an investment firm
•  Evaluated market size, profit margin, competitive landscape, regulatory and technical
    assessments, and alternative markets for a novel drug developed by Mobius ABX
    to treat endophthalmitis
•  Conducted primary research and interviews with physicians and pharmacists
    to identify state-of-the-art endophthalmitis treatments and receptiveness to new
    treatment options
•  Co-authored a report for BioGenerator based on compiled research, which was used
    as a primary source in funding decisions

Conducted patent analysis across a range of fields, including next-generation DNA sequencing, bio-sensing nanoparticles, and cancer therapy

Graduate Researcher/Postdoctoral Fellow

2010-present

•  Built a pipeline spanning data collection, data transfer, statistical analysis, and result
    visualization, using C, Shell scripting, MATLAB and Python
•  Built a system in C that both tracks an experimental animal’s performance in a task,
    including eye and arm movements, and instantaneously modifies task design based on
    the performance
•  Developed multivariate causality methods based on information theory and Granger
    causality. Analyzed over 100 GB of data, leading to 11 publications and a successful
    funding proposal of $380,000 (MATLAB)
•  Implemented a physiologically relevant, feature-based k-means clustering for spike
    sorting. Analyzed over 20 GB of data, decreasing the false negative rate by 20%
    (MATLAB)
•  Developed a three-layer artificial neural network for handwritten digit recognition, using
    over 60,000 pictures, with over 90% accuracy (MATLAB)
•  Built a random forest model to forecast month-to-month Canadian retail sales numbers.
    Analyzed 20 years of economic data, increasing accuracy by more than 20%
    compared to existing methods (Python)





Other Experiences


Teaching Assistant

2011

Neurophysiology Lab, BIO 404
•  Prepared and delivered course materials, led group discussion, performed laboratory
    demonstrations
•  Graded and provided feedback on experimental reports



NeuroDay Presenter, Saint Louis Science Center

2011

•  Created and led a demonstration to illustrate the limitations of the human attention span
•  Reviewed the literature on the neural mechanism of attention
•  Designed an intuitive heuristic that was accessible to non-experts



Director of Organization Department, Peking University Chinese Literature Club

2008


Summer Olympic and Special Olympic Volunteer

2008




Technical Skills


C, Python, MATLAB, SQL, Scala, Hadoop MapReduce, Spark
Machine learning, scikit-learn, decision trees, naive Bayes, artificial neural networks, natural language processing, web scraping, statistical analysis, probability theory, information theory
Unix/Linux (Shell scripting)



Awards and Honors


Finalist for O'Leary Prize 2015
(Award recognizing outstanding dissertation research)
Acceptance into Cognitive, Computational, and Systems Neuroscience pathway 2014
(National Institutes of Health funded training program)
Ellen Eoyang Scholarship 2008
(Award recognizing outstanding academic performance)
Outstanding Olympics Volunteer Award 2008
3rd Prize of China Biology Olympiad (Provincial Level) 2005
2nd Prize of China Mathematical Olympiad (Provincial Level) 2000









CONTACT INFO

Jingfeng Li
jingfengmli@gmail.com
jingfengli@go.wustl.edu
314-616-2743