Data Science
 		
 			Course code: 1801010812I0P1002Y
 			English name: Data Science
 			Class hours: 60
 			Credits: 3.00
 			Course type: Core disciplinary course
 			Instructor: Tiejian Luo (罗铁坚)
 		
 		
 			
Teaching Objectives and Requirements
 			This course is a professional seminar for graduate students in computer software and theory. It aims to help students master the fundamentals of network science and understand its fields of application, with a focus on the common models of network science. Students are required to: (1) master the basic methods of network science, including common models and algorithms; and (2) master the main ideas of network modeling and network behavior analysis. The course gives computer science graduate students a thorough grasp of research trends and the latest techniques in network science, along with a preliminary understanding of its applications in different fields. It is intended to cultivate students' capacity for research-oriented learning, broaden their horizons, and lay a solid foundation for future research and application.
 		
 			
Prerequisites
 			Calculus, Linear Algebra, Probability and Statistics
 		
 			
Course Outline
 			Chapter 1 Introduction (Week 1) 4.0 class hours Tiejian Luo
Section 1  Course contents
Section 2  Teaching outcomes
Section 3  Competency = [Knowledge + Skills + Disposition] in Task
Section 4  Assignment and assessment
Chapter 2 Research Topics (Week 2) 4.0 class hours Tiejian Luo
Section 1  Ten Research Areas
Section 2  Data Science Paradigm
Section 3  Domain Knowledge and Data Science
Section 4  Accelerating Invention and Discovery
Section 5  Exercises
Chapter 3 Domain Problems and Case Study (Weeks 3, 4) 8.0 class hours Tiejian Luo
Section 1  Instructional objectives
Section 2  Case 1 Computing Lens for Social Science
Section 3  Case 2 Visualizing Seattle Bicycle Counts
Section 4  Case 3 Predicting Bicycle Traffic
Section 5  Case 4 Optical Character Recognition: Identifying Handwritten Digits
Section 6  Case 5 k-Means for Color Compression
Section 7  Exercises
Chapter 4 IPython: Beyond Normal Python (Weeks 5, 6) 8.0 class hours Tiejian Luo
Section 1  Instructional objectives
Section 2  IPython Magic Commands
Section 3  IPython and Shell Commands
Section 4  Errors and Debugging
Section 5  Exercises
Chapter 5 Introduction to NumPy (Weeks 7, 8) 8.0 class hours Tiejian Luo
Section 1  Instructional objectives
Section 2  Understanding Data Types in Python
Section 3  The Basics of NumPy Arrays
Section 4  Structured Data: NumPy's Structured Arrays
Section 5  Exercises
Chapter 6 Data Manipulation with Pandas (Week 9) 4.0 class hours Tiejian Luo
Section 1  Instructional objectives
Section 2  Pandas Objects
Section 3  Data Indexing and Selection
Section 4  Aggregation and Grouping
Section 5  Exercises
Chapter 7 Visualization with Matplotlib (Week 10) 4.0 class hours Tiejian Luo
Section 1  Instructional objectives
Section 2  Line and Scatter Plots
Section 3  Histograms, Binnings, and Density
Section 4  Three-Dimensional Plotting
Section 5  Exercises
Chapter 8 Machine Learning 4.0 class hours Tiejian Luo
Section 1  Instructional objectives
Section 2  What is ML?
Section 3  Introducing Scikit-Learn
Section 4  Hyperparameters and Model Validation
Section 5  Feature Engineering
Chapter 9 Special Topics 1 (Week 12) 4.0 class hours Tiejian Luo
Section 1  Instructional objectives
Section 2  Naive Bayes Classification
Section 3  Linear Regression
Section 4  Exercises
Chapter 10 Special Topics 2 (Week 13) 4.0 class hours Tiejian Luo
Section 1  Instructional objectives
Section 2  Support Vector Machines
Section 3  Decision Trees and Random Forests
Section 4  Exercises
Chapter 11 Special Topics 3 (Week 14) 4.0 class hours Tiejian Luo
Section 1  Instructional objectives
Section 2  Principal Component Analysis
Section 3  k-Means Clustering
Section 4  Exercises
 		
 			    
Textbook
 			    
		      	  	  1.
		      	  	  High-Dimensional Data Analysis
			      	  John Wright and Yi
			      	  January 2022
			      	  Cambridge University Press
		      	  	
		      	 
 		
 			
Reference Books
 			
 		
 		
 			
Instructor Information
 			Tiejian Luo is a professor and doctoral supervisor at the University of Chinese Academy of Sciences. He previously served as Executive Dean of the School of Information Science and Engineering at the Graduate School of the Chinese Academy of Sciences. He has led more than 10 national and corporate research projects and has published 112 articles and one monograph in English. He holds more than 30 software copyrights and invention patents. Since 2003, the team he leads has designed and implemented the education cloud of the University of Chinese Academy of Sciences, which has run successfully for more than 10 years. The University of Chinese Academy of Sciences owns all intellectual property rights to the cloud platform's more than 30 application systems (more than 6 million lines of source code). In recent years, he has published papers in high-level conferences and journals such as AAAI and IEEE Transactions on Cybernetics, proposing new models for specific problems in natural language understanding and computer vision that improved on the best reported accuracy on the related public datasets. He received the 2017 Chinese Academy of Sciences Excellent Teacher Award and the first prize for Excellent Instructor in the National College of Intelligent Intelligence Competition of the China Artificial Intelligence Society.