Using a Jupyter notebook, pick one dataset from https://data.sanjoseca.gov/dataset and complete the following:

1. Data Description and Curiosity Questions about the data:

- Background or context of the data selected: sources, how it was collected, the time period it represents, and the context in which it was collected, if available.
- Reason(s) why you selected it.
- Description of the data:
  - how big it is (number of observations, number of variables),
  - how many numeric variables,
  - how many categorical variables,
  - description of the variables, if available.
- Are there any missing values?
- Any duplicate rows?
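
The size, variable-type, missing-value, and duplicate checks above can be sketched in pandas roughly as follows. This is only a minimal sketch: the small DataFrame and its column names (`category`, `value`, `count`) are made-up placeholders, and in the notebook `df` would instead be loaded from the chosen dataset's file, for example with `pd.read_csv`.

```python
import pandas as pd

# Hypothetical stand-in for the chosen dataset (values are invented for
# illustration only); in the notebook, load the real file instead.
df = pd.DataFrame({
    "category": ["A", "B", "A", "C", "B", "A"],
    "value": [10.0, 12.5, None, 7.0, 12.5, 10.0],
    "count": [1, 2, 3, 4, 2, 1],
})

# Size: number of observations (rows) and variables (columns).
n_rows, n_cols = df.shape

# Split columns into numeric and non-numeric (treated here as categorical).
numeric_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

# Missing values per column, and number of fully duplicated rows.
missing_per_column = df.isna().sum()
n_duplicates = int(df.duplicated().sum())

print(f"{n_rows} observations, {n_cols} variables")
print("numeric:", numeric_cols)
print("categorical:", categorical_cols)
print(missing_per_column)
print("duplicate rows:", n_duplicates)
```

Treating every non-numeric column as categorical is a simplification; date columns or free-text columns in a real dataset may need to be classified by hand.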

- Compute summary statistics (mean, median, mode, standard deviation, variance, range).
- Select one categorical variable and compute these statistics for a numeric variable, grouped by that categorical variable.
- Record your observations. What did you find most fascinating in your descriptive analysis?
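
One possible way to compute these statistics, overall and per group, is sketched below. The data and the column names `category` and `value` are invented placeholders, and `first_mode` and `value_range` are hypothetical helpers written for this sketch, not pandas functions.

```python
import pandas as pd

# Hypothetical placeholder data; replace with the chosen dataset.
df = pd.DataFrame({
    "category": ["A", "A", "A", "B", "B", "B"],
    "value": [2.0, 4.0, 4.0, 1.0, 5.0, 9.0],
})


def first_mode(s):
    """Return the smallest mode (a series can have several modes)."""
    return s.mode().iloc[0]


def value_range(s):
    """Return max minus min."""
    return s.max() - s.min()


# Overall statistics for one numeric variable.
col = df["value"]
overall = {
    "mean": col.mean(),
    "median": col.median(),
    "mode": first_mode(col),
    "std": col.std(),      # sample standard deviation (ddof=1)
    "var": col.var(),      # sample variance (ddof=1)
    "range": value_range(col),
}

# The same statistics, grouped by one categorical variable.
by_category = df.groupby("category")["value"].agg(
    mean="mean",
    median="median",
    mode=first_mode,
    std="std",
    var="var",
    range=value_range,
)

print(overall)
print(by_category)
```

Note that pandas uses the sample (n - 1) formulas for `std` and `var` by default, which is worth stating in the write-up.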

2. Descriptive Statistics and Visualization (at least two of the five listed below)

- Relationship between variables
- Trend
- Distribution of the variable(s)
- Spatial data representation
- Comparison of summary statistics across categories
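
As an illustration of two of these options (the distribution of a variable and a comparison across categories), a minimal matplotlib sketch might look like this. The data and column names are placeholders, and the choice of a histogram and a box plot is just one reasonable assumption; other chart types would serve equally well.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical placeholder data; replace with the chosen dataset.
df = pd.DataFrame({
    "category": ["A", "A", "A", "B", "B", "B"],
    "value": [2.0, 4.0, 4.0, 1.0, 5.0, 9.0],
})

fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# Distribution of one numeric variable.
ax_hist.hist(df["value"].dropna(), bins=5)
ax_hist.set_title("Distribution of value")
ax_hist.set_xlabel("value")
ax_hist.set_ylabel("frequency")

# Comparison across categories: one box per group.
labels, groups = [], []
for name, group in df.groupby("category")["value"]:
    labels.append(name)
    groups.append(group.dropna().to_numpy())
ax_box.boxplot(groups)
ax_box.set_xticklabels(labels)
ax_box.set_title("value by category")
ax_box.set_xlabel("category")
ax_box.set_ylabel("value")

fig.tight_layout()
```

In a Jupyter notebook the figure is normally displayed when the cell finishes running.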

3. Generate at least one hypothesis and perform a hypothesis test.
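
As one example of what such a test could look like, the sketch below compares the means of two groups with Welch's two-sample t-test from SciPy. Both the test choice and the numbers are assumptions for illustration: the right test depends on the hypothesis and the data (a chi-square test or ANOVA may fit better), and the two lists are invented values, not taken from any San Jose dataset.

```python
from scipy import stats

# Hypothetical measurements of one numeric variable in two categories.
group_a = [10.0, 12.0, 11.0, 13.0, 12.0]
group_b = [15.0, 17.0, 16.0, 18.0, 17.0]

# H0: the two groups have the same mean.
# H1: the two group means differ (two-sided).
alpha = 0.05

# equal_var=False selects Welch's t-test, which does not assume
# equal variances in the two groups.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)

reject_null = p_value < alpha
print(f"t = {t_stat:.3f}, p = {p_value:.5f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

The write-up should state the null and alternative hypotheses, the significance level, and the conclusion in plain language alongside the numbers.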

4. Summarize your observations

Please write as thorough a summary as you can, and make sure to answer all of the questions.

Also, use the template I’m providing to you.

