ECON 570 Problem Set 3

Lalonde NSW Data

A. Load the Lalonde experimental dataset with the lalonde_data method from the module causalinference.utils. The outcome variable is earnings in 1978, and the covariates are, in order:

Black        Indicator variable; 1 if Black, 0 otherwise.
Hispanic     Indicator variable; 1 if Hispanic, 0 otherwise.
Age          Age in years.
Married      Marital status; 1 if married, 0 otherwise.
Nodegree     Indicator variable; 1 if no degree, 0 otherwise.
Education    Years of education.
E74          Earnings in 1974.
U74          Unemployment status in 1974; 1 if unemployed, 0 otherwise.
E75          Earnings in 1975.
U75          Unemployment status in 1975; 1 if unemployed, 0 otherwise.

Using CausalModel from the module causalinference, provide summary statistics for the outcome variable and the covariates. Which covariate has the largest normalized difference?
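For reference, the normalized difference that CausalModel reports for each covariate is the difference in group means scaled by the pooled standard deviation, (x̄_t − x̄_c) / √((s_t² + s_c²)/2). A minimal NumPy sketch on made-up numbers (the arrays below are hypothetical, not Lalonde values):

```python
import numpy as np

def normalized_difference(x_t, x_c):
    """Normalized difference in covariate means between treated and
    control groups: (mean_t - mean_c) / sqrt((var_t + var_c) / 2)."""
    x_t, x_c = np.asarray(x_t, dtype=float), np.asarray(x_c, dtype=float)
    pooled_sd = np.sqrt((x_t.var(ddof=1) + x_c.var(ddof=1)) / 2.0)
    return (x_t.mean() - x_c.mean()) / pooled_sd

# Hypothetical covariate values for treated and control units
treated = [2.0, 3.0, 4.0, 5.0]
control = [1.0, 2.0, 3.0, 4.0]
print(round(normalized_difference(treated, control), 3))  # 0.775
```

Unlike a t-statistic, this measure does not grow with the sample size, which is why it is the conventional diagnostic for covariate balance.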

B. Estimate the propensity score using the selection algorithm est_propensity_s. In selecting the basic covariate set, specify E74, U74, E75, and U75. What are the additional linear terms and second-order terms that were selected by the algorithm?
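Under the hood, est_propensity_s fits a logistic regression of treatment status on the selected covariates. The same underlying model can be sketched with scikit-learn (synthetic data here, not the NSW sample):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))                 # stand-ins for basic covariates
# Treatment assignment depends on the first covariate
p_true = 1.0 / (1.0 + np.exp(-X[:, 0]))
D = rng.binomial(1, p_true)

# Propensity score: estimated P(D = 1 | X) from a logit model
model = LogisticRegression().fit(X, D)
pscore = model.predict_proba(X)[:, 1]
print(pscore.shape)  # (500,)
```

The selection algorithm goes further than this sketch: it adds linear and second-order terms one at a time, keeping a term when the likelihood-ratio statistic exceeds a threshold.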

C. Trim the sample using trim_s to get rid of observations with extreme propensity score values. What is the cut-off that is selected? How many observations are dropped as a result?
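trim_s searches for the optimal cut-off α and then drops units whose estimated propensity score falls outside [α, 1 − α]. The dropping step itself is just a mask, sketched here with a hypothetical α (the α below is illustrative, not the value trim_s will select):

```python
import numpy as np

def trim(pscore, alpha):
    """Keep only observations with alpha <= pscore <= 1 - alpha."""
    pscore = np.asarray(pscore, dtype=float)
    return (pscore >= alpha) & (pscore <= 1.0 - alpha)

pscore = np.array([0.02, 0.15, 0.50, 0.85, 0.99])
keep = trim(pscore, alpha=0.10)          # hypothetical cut-off
print(keep.sum(), "of", keep.size, "observations kept")  # 3 of 5
```

Trimming sacrifices observations with little overlap in exchange for more precise estimates on the remaining sample.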

D. Stratify the sample using stratify_s. How many propensity bins are created? Report the summary statistics for each bin.
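stratify_s chooses the bins by recursively splitting the sample on the propensity score. A simpler, non-recursive quantile-based stratification conveys the idea (the scores below are hypothetical):

```python
import numpy as np

def stratify_by_quantiles(pscore, n_bins):
    """Assign each observation to a propensity-score bin (0..n_bins-1)
    using equal-frequency (quantile) cut points."""
    pscore = np.asarray(pscore, dtype=float)
    edges = np.quantile(pscore, np.linspace(0, 1, n_bins + 1))
    # searchsorted places each score in its bin; clip keeps the
    # maximum score inside the last bin
    return np.clip(np.searchsorted(edges, pscore, side="right") - 1,
                   0, n_bins - 1)

pscore = np.array([0.05, 0.2, 0.35, 0.5, 0.65, 0.8, 0.9, 0.95])
bins = stratify_by_quantiles(pscore, n_bins=4)
print(bins)  # [0 0 1 1 2 2 3 3]
```

Within each bin, treated and control units have similar propensity scores, so simple mean comparisons inside a bin are approximately unconfounded.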

E. Estimate the average treatment effect using OLS, blocking, and matching. For matching, set the number of matches to 2 and adjust for bias. How much do the estimates differ?
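The matching estimator imputes each unit's missing potential outcome with the outcome of its nearest neighbours in covariate space and averages the implied unit-level effects. A deliberately simplified 1-nearest-neighbour version on a single covariate (not the library's implementation: matches = 2 and the bias adjustment are omitted for brevity):

```python
import numpy as np

def matching_ate(y, d, x):
    """1-NN matching ATE on a single covariate x (no bias adjustment)."""
    y, d, x = (np.asarray(a, dtype=float) for a in (y, d, x))
    treated, control = np.where(d == 1)[0], np.where(d == 0)[0]
    effects = []
    for i in range(len(y)):
        pool = control if d[i] == 1 else treated
        j = pool[np.argmin(np.abs(x[pool] - x[i]))]  # nearest neighbour
        # Impute the missing potential outcome with the match's outcome
        effects.append(y[i] - y[j] if d[i] == 1 else y[j] - y[i])
    return float(np.mean(effects))

# Toy data: treatment raises the outcome by 2 at every covariate level
x = np.array([1.0, 2.0, 3.0, 1.1, 2.1, 3.1])
d = np.array([1, 1, 1, 0, 0, 0])
y = np.array([3.0, 4.0, 5.0, 1.0, 2.0, 3.0])
print(matching_ate(y, d, x))  # 2.0
```

The bias adjustment requested in the problem corrects for the fact that matches are not exact: it subtracts an estimate of the outcome difference attributable to the residual covariate gap between a unit and its match.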


Document Classification


A. From the module sklearn.datasets, load the training data set using the method fetch_20newsgroups. This dataset comprises around 18,000 newsgroup posts on 20 topics. Print out a couple of sample posts and list all the topic names.

B. Convert the posts (blobs of text) into bag-of-words vectors. What is the dimensionality of these vectors? That is, what is the number of words that appear in this data set?
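CountVectorizer from sklearn.feature_extraction.text builds exactly this representation; on a toy corpus (not the newsgroups data), the vector dimensionality equals the vocabulary size:

```python
from sklearn.feature_extraction.text import CountVectorizer

posts = [
    "the cat sat on the mat",
    "the dog ate the cat",
]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(posts)   # sparse matrix: posts x vocabulary
print(X.shape)                        # (2, 7): 7 distinct words
print(sorted(vectorizer.vocabulary_))
```

On the real corpus, X.shape[1] answers the dimensionality question directly.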

C. Use your favorite dimensionality reduction technique to compress these vectors into ones of K = 30 dimensions.
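One common choice here is truncated SVD (latent semantic analysis), which works directly on count matrices without centering them; a sketch with random stand-in data in place of the real document-term matrix:

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Stand-in count matrix: 100 "posts" x 500 "words" (not the real corpus)
rng = np.random.default_rng(0)
X = rng.poisson(0.05, size=(100, 500)).astype(float)

svd = TruncatedSVD(n_components=30, random_state=0)  # K = 30
Z = svd.fit_transform(X)
print(Z.shape)  # (100, 30): each post is now a 30-dimensional vector
```

Any other reducer with a fit_transform interface (e.g. PCA on a dense matrix) slots into the same place.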

D. Use your favorite supervised learning model to train a model that tries to predict the topic of a post from the vectorized representation you obtained in the previous step.

E. Use the test data to tune your model. Make sure to include K as a hyperparameter as well. Use accuracy_score from sklearn.metrics as your evaluation metric. What is the highest accuracy you are able to achieve?
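The tuning loop can be sketched end to end: vectorize, compress to K dimensions, fit a classifier, and score each K with accuracy_score. The toy corpus and the choice of logistic regression below are illustrative stand-ins, and K ranges over small values rather than 30 because the corpus is tiny:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

posts = ["cheap pizza deal", "pizza coupon today", "great pizza offer",
         "election results tonight", "senate vote today",
         "election poll results"]
topics = [0, 0, 0, 1, 1, 1]

X = CountVectorizer().fit_transform(posts)
best_k, best_acc = None, -1.0
for k in (2, 3, 4):                      # K is tuned as a hyperparameter
    Z = TruncatedSVD(n_components=k, random_state=0).fit_transform(X)
    clf = LogisticRegression().fit(Z, topics)
    # In-sample accuracy, for illustration only; the exercise asks you
    # to score on the held-out test split instead
    acc = accuracy_score(topics, clf.predict(Z))
    if acc > best_acc:
        best_k, best_acc = k, acc
print(best_k, best_acc)
```

For the real exercise, fit the vectorizer, SVD, and classifier on the training split only, then evaluate each candidate K on the test split.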
