Course Syllabus

Data Science

Course code: 0812I0D01002H  English title: Data Science  Class hours: 60  Credits: 4.00  Course type: First-level discipline core course  Instructor: Tiejian Luo

Teaching Objectives and Requirements
This course is a professional seminar for graduate students in computer software and theory. Its purpose is to enable students to master the basic content of network science and understand its application fields. The course focuses on the common models of network science. The requirements for students are as follows: master the basic methods of network science, including common models and algorithms; master the main ideas of network modeling and network behavior analysis. The course enables computer science graduate students to gain a deep grasp of research trends and the latest technology in network science, along with a preliminary understanding of its applications in different fields. It is intended to cultivate graduate students' research and learning abilities, broaden their horizons, and lay a solid foundation for future research and application.

Prerequisites
Discrete Mathematics

Syllabus Contents
Chapter 1 Introduction (Week 1) 4.0 class hours Tiejian Luo
Section 1 Course contents
Section 2 Teaching outcomes
Section 3 Competency = [Knowledge + Skills + Disposition] in Task
Section 4 Assignment and assessment
Chapter 2 Research Topics (Week 2) 4.0 class hours Tiejian Luo
Section 1 Ten Research Areas
Section 2 Data Science Paradigm
Section 3 Domain Knowledge and Data Science
Section 4 Accelerating invention and discovery
Section 5 Exercises
Chapter 3 Domain Problems and Case Study (Weeks 3, 4) 8.0 class hours Tiejian Luo
Section 1 Instructional objectives
Section 2 Case 1 Computing Lens for Social Science
Section 3 Case 2 Visualizing Seattle Bicycle Counts
Section 4 Case 3 Predicting Bicycle Traffic
Section 5 Case 4 The optical character recognition problem: identifying handwritten digits
Section 6 Case 5 k-means for color compression
Section 7 Exercises
Chapter 4 IPython: Beyond Normal Python (Weeks 5, 6) 8.0 class hours Tiejian Luo
Section 1 Instructional objectives
Section 2 IPython Magic Commands
Section 3 IPython and Shell Commands
Section 4 Errors and Debugging
Section 5 Exercises
Chapter 5 Introduction to NumPy (Weeks 7, 8) 8.0 class hours Tiejian Luo
Section 1 Instructional objectives
Section 2 Understanding Data Types in Python
Section 3 The Basics of NumPy Arrays
Section 4 Structured Data: NumPy's Structured Arrays
Section 5 Exercises
Chapter 6 Data Manipulation with Pandas (Week 9) 4.0 class hours Tiejian Luo
Section 1 Instructional objectives
Section 2 Pandas Objects
Section 3 Data Indexing and Selection
Section 4 Aggregation and Grouping
Section 5 Exercises
Chapter 7 Visualization with Matplotlib (Week 10) 4.0 class hours Tiejian Luo
Section 1 Instructional objectives
Section 2 Line and Scatter Plots
Section 3 Histograms, Binnings, and Density
Section 4 Three-Dimensional Plotting
Section 5 Exercises
Chapter 8 Machine Learning 4.0 class hours Tiejian Luo
Section 1 Instructional objectives
Section 2 What is ML?
Section 3 Introducing Scikit-Learn
Section 4 Hyperparameters and Model Validation
Section 5 Feature Engineering
Chapter 9 Special Topics 1 (Week 12) 4.0 class hours Tiejian Luo
Section 1 Instructional objectives
Section 2 Naive Bayes Classification
Section 3 Linear Regression
Section 4 Exercises
Chapter 10 Special Topics 2 (Week 13) 4.0 class hours Tiejian Luo
Section 1 Instructional objectives
Section 2 Support Vector Machines
Section 3 Decision Trees and Random Forests
Section 4 Exercises
Chapter 11 Special Topics 3 (Week 14) 4.0 class hours Tiejian Luo
Section 1 Instructional objectives
Section 2 Principal Component Analysis
Section 3 k-Means Clustering
Section 4 Exercises

Reference Books
1. Brian Godsey. Think Like a Data Scientist: Tackle the Data Science Process Step-by-Step. Manning, March 2017.

Instructor Information
Tiejian Luo is a professor and doctoral supervisor at the University of Chinese Academy of Sciences. He was the Executive Dean of the School of Information Science and Engineering at the Graduate School of the Chinese Academy of Sciences. He has led more than 10 national and corporate research projects and has published a total of 112 articles and a monograph in English. He holds more than 30 software copyrights and invention patents. Since 2003, the team he leads has designed and implemented the education cloud of the University of Chinese Academy of Sciences, which has been running successfully for more than 10 years. The University of Chinese Academy of Sciences owns all the intellectual property rights to the cloud platform's more than 30 application systems (more than 6 million lines of source code). In recent years, he has published many academic papers in high-level conferences and journals such as AAAI and IEEE Transactions on Cybernetics, proposed new models for specific problems in natural language understanding and computer vision, and improved on the best reported accuracy for the related public datasets. He received the 2017 Chinese Academy of Sciences Excellent Teacher Award and the first prize for Excellent Instructor in the National College of Intelligent Intelligence Competition of the China Artificial Intelligence Society.