
Inferential Statistics

 

AIM:- To collect data and find the sampling distribution, standard deviation, and standard error.





The dataset used for this assignment is “Salary of Data Scientists”, taken from Kaggle, a data science competition platform and online community of data scientists and machine learning practitioners owned by Google LLC.

DESCRIPTION OF THE DATASET

It lists the salaries offered for different roles in the field of data science across the globe, for the years 2020 to 2023.

 

The dataset has 3755 rows × 11 columns.

 

It has the following columns:-

·         work_year      

·         experience_level

·         employment_type

·         job_title

·         salary  

·         salary_currency

·         salary_in_usd 

·         employee_residence   

·         remote_ratio

·         company_location

·         company_size


STEPS UNDERTAKEN 

1)      Data collection- In inferential statistics, data collection refers to the process of gathering information or observations from a population or a sample of that population. This process involves selecting a representative group of individuals, collecting relevant data through various methods (surveys, experiments, observations), and organizing the collected data into a dataset.

 

2)      Sampling Distribution- In inferential statistics, the sampling distribution is the theoretical distribution of a sample statistic (for example, the sample mean) across repeated random samples of the same size drawn from a population. It plays a crucial role in making inferences about population parameters from sample statistics, and it underlies hypothesis testing and the construction of confidence intervals.


3)      Standard Deviation- The standard deviation (often denoted by the symbol σ for a population standard deviation or s for a sample standard deviation) measures how much individual data points in a dataset deviate from the mean. A low standard deviation indicates that the data points tend to be close to the mean, while a high standard deviation indicates that the data points are spread out over a wider range.

 

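     s = sqrt( Σ(xi − x̅)² / n ), where xi are the sample observations, x̅ is the sample mean, and n is the sample size. For a population, replace x̅ with the population mean μ and s with σ. (R's built-in sd() divides by n − 1 instead of n; for n = 800 the two results differ by less than 0.1%.)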

4)      Standard error-The standard error (SE) is a measure of the variability or precision of a sample statistic. It quantifies the degree to which a sample statistic, such as the sample mean or sample proportion, is expected to deviate from the true population parameter.

 

If n is the sample size and σ the standard deviation (estimated by the sample value s when σ is unknown), then

Standard error (SE) = σ / √n
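
The formulas in steps 3 and 4 can be checked numerically. The sketch below is written in Python rather than R and is not the code used for this post. The function names, the seed, and the made-up population are assumptions for illustration only. It approximates a sampling distribution of the mean by drawing many samples and compares the spread of the sample means with σ / √n.

```python
import math
import random

def std_dev(values):
    """Standard deviation with divisor n, matching the formula in step 3."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def std_error(values):
    """Standard error of the mean: SD / sqrt(n), as in step 4."""
    return std_dev(values) / math.sqrt(len(values))

# Hypothetical population (NOT the Kaggle data): 5000 made-up values.
rng = random.Random(42)
population = [rng.gauss(140000, 60000) for _ in range(5000)]
sigma = std_dev(population)

# Approximate the sampling distribution of the mean:
# draw many samples of size n (with replacement) and record each sample mean.
n = 100
sample_means = []
for _ in range(2000):
    sample = rng.choices(population, k=n)
    sample_means.append(sum(sample) / n)

empirical_se = std_dev(sample_means)    # spread of the simulated sample means
theoretical_se = sigma / math.sqrt(n)   # sigma / sqrt(n)
print(round(empirical_se), round(theoretical_se))
```

The two printed values should be close to each other, which is the sense in which the standard error describes the spread of the sampling distribution.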

The following sections draw random samples from the dataset. Any tool can be used for this; the samples below were generated in R.

 

The variable sampled is salary_in_usd, and the sample size (n) chosen is 800.

 
SAMPLING WITHOUT REPLACEMENT

 

Sampling without replacement refers to the process of selecting elements from a population in such a way that once an element is chosen, it is not put back into the population. As a result, each selection reduces the size of the population for subsequent selections. Sampling without replacement is particularly relevant when considering finite populations.
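
A minimal sketch of drawing without replacement, in Python rather than the R used for this post. The list of row ids is a hypothetical stand-in for the 3755 rows of the dataset, and the seed is arbitrary. In R, sample() draws without replacement by default (replace = FALSE).

```python
import random

# Hypothetical stand-in for the dataset: one id per row (3755 rows).
population_ids = list(range(3755))

rng = random.Random(0)  # arbitrary fixed seed so the draw is repeatable

# random.sample draws WITHOUT replacement: no id can be picked twice.
sample_ids = rng.sample(population_ids, k=800)

print(len(sample_ids), len(set(sample_ids)))  # 800 800
```

Both counts are 800 because every selected row is distinct.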

 

The 800 sampled values are:-

 

[1] 137500 275000  67723  74178 153000  50000 180000 115000 128058 250000 117104 100000 153600 170000 168400 146115 165220

 [18] 210000 240000 195400 104611 231250  92350  33808 150000  29944 155000 150000 148500  98506 183600 250000  43809 230000

 [35]  83171 129300  24342  21013 119000 115000 106020  72000 106800 120000 116100 191200 119059 118208 160000 208000  75000

 [52] 200000 184000 138938  85000 112900 169200  50000  44365 252000  17805  75000 105200 101228 172200  20000 170550 120000

 [69] 171600  95000  73900 124234 198200  73546 172200 135000  10354 133766 130000 160000 106800  90320 129300 100706   8000

 [86] 140000 129300 203000  96113 190000 145000  94300 200000  75000 141525 135000 376080 180000 130000 110000 185000 185000

[103]  48000 128000  90000   9272   5723 190000 196000 117000  63000 204500 129300 120000 156400 195400 145000  48609  96100

[120] 236900 130000 208450 185900 173000  52533 130000  20171 129300 110000 210000 116000 160000 183500 170000 205000  48289

[137] 250000  80000 250000 191200 164000  42026  40000 200000 141525  69751 252000 170000  70500 162000 135000 169000  30000

[154] 250000 110000  33609 146000 143200 342810 140000 109280 130000 239000 139860 147100  30523 140000  20000 239748 130000

[171] 155000 140000 115440 247500 150000  61566 110600  70000 184000 147100  19522 196200 159200  29751 222200 192000 272550

[188]   6304 136000  84000 225000 179820  86466 184000 131300 187500 135000 195400 110000 216000 115500 126000 180000 176000

[205] 145000 140000 126000 150000 243225  59888 207000 140000 100000 153600 120000 150000  47280 178500 129300  51753 236000

[222]  62000 195000 172600  92000 180000  36773 150000 220000  85700 205000 179975 206699 170000  98506 175950  80000 185000

[239]  95000 236000 142200 275300 120000  24823 135000 110600 125000 144000 201036 138900 169000 100000  75000 100000 110000

[256] 112000 150000 148700 102100 206000 130000 110000  48289 184000 136000 114000 249500 310000  50000 120000  89200 198800

[273] 289800 250000 135000 125000 182000 291500 115000 115000 115360 120000 225000 200000 125000 170000  54685 129300 174500

[290]  49253 100000 130050 159000  85000 130000 199000 170000 150000 191475 130000 120000 129000 180000  57872 175000 184000

[307] 190000 116000 142200 135000  75000  85066  30000  31520 198800  90734 300000 150000 175000 284310  70186 149500  55000

[324] 150000 185000  98506  53192 135000 160000 132100 154000   7799 135000  48289 130000 236600 187000 138750 250000 122500

[341] 105200 150000 121700 165000 165000 119000  79833 130000 198000  63000 200000 185900 165000 205000 221484 120000 130500

[358] 128000 188800 115000 133300 141525 139000 114000 136000 123000  89294  85000 141525  48609  95000 288000 106250  80000

[375] 139500 112000 102000 106900 310000  66837  92700 275000 180000  95000 145000 140000 238000 153600 299500 150000 125000

[392] 210000 139000 225000 113000 260000  28369  37824 226700  18053 213660 239748  95000 139500 120000 280000 105000 250000

[409] 185100 120000 190000 100000 175000 249300 232200  78000 150000  84000 191475 110000  99000  73546 159699  68293  54742

[426] 150000 350000 200000 170000  90320 110000  63000 225000 140000 151902  20000 130000  36773 179305 175000  95000 156400

[443] 216200 175000 135000 203500  92350 128000 110820 135000 136000 127000 167000 183000 124000 180000 127000 133000 115000

[460]   7500  82744 122700 100000  70000 140000 180000 216000  63000 150000  65000  84900  75000  22892 206000 109000 192400

[477] 140000 156400 123700 145000 231250 191475 180000 171000  95000  22809 156600  50000  99000 162500 120250 140000 145000

[494]  55410 103000 196000 100000 130000 153400  75000 242000 123000  47280  92250 156400 154560 153600 174500 229000 107309

[511] 100000 140000  75000 105000 155499 160000 100000  63192 205600 129300  13493 168400 110000 200000  73880 370000 130000

[528] 175000 120000  78000 159000 132000 172200  65000 145000 175100 109400 205000   5409 120000  48000 216000  62000 123648

[545] 276000 138750 200000 110000  95000 130800 250000 155000  84053 235000  90320  70000 182000 275000 145000 286000 405000

[562] 150000 100000 112900 185700 165000 160000 150000 135000  65013 200000 135000 252000  80000 150000 135000 170000 165000

[579] 120250 152000 104663 120000 210000  46809  28476  85500 185000 160000 170000 141525  75000  99000 225000 150000 194000

[596] 216000 196000  80000  52500 130000 115934 141525 100000 153000 143865 128000  90320  85066 180180 203500 201000 150000

[613] 155000 110000 188100  80000 300000  80000 200000 131300 191200 240000 120000  95000  83171 230000 230000  20000  90700

[630] 230000 186000 135000 135000 215300 116450 160000 160000 145000  60000 170500 200000 162500  75000 140400  75000  75116

[647] 200000 226700 129000  95000 126000 116000 135000  20000  15966  48000 185000 218500 345600 135000 100000  95000  58837

[664] 108000  90000  61896 175000 191475 136000  75000  63040 205300 250000  74540 117100 174500 136000 310000 185900 122600

[681] 220000 120000 201450 160000  29453 102100 106500 170000 100000  95000  80036 175000  40663 190200 145000 187200  56100

[698] 149076 158677  99450 136000 220000 252000  52500 150900  55000  81666  76814 130000  73546 186000 130000 154600  86193

[715] 198800 100000  80000 130000 166000 100000  55000 253750 270703  42533 100000  87980  83270  68293  63831 152900  77684

[732] 120000 135000  95000 180180 179820 134000 160000 175000 145000 416000 115000  55000 212200 237000  65000 105000 210000

[749] 125976  53368 106000  75000  38400 136000 143860 124000 144000 112300 150000 135000 185900  13989  73742 112900 100000

[766] 262000  60000 237000 115000 200000 138000 160000 167875 102000 182500 161500  80041 150000 149076 160000 125600 270000

[783] 151800  90000 210000 140000  93700  74000  38000  96000 126000 145900 110000 195000  61467 190000 175000 106800 169000

[800] 247500

 

 

Mean = x̅ = Sum of observations / no. of observations = 139454.1

 

SD = s = sqrt( Σ(xi − x̅)² / n ) = 64099.79

 

SE = s / √n = 2266.27

SAMPLING WITH REPLACEMENT

 

When sampling is conducted with replacement, it means that each item selected from the population is returned to the population before the next selection. This process allows for the same element to be chosen more than once in the sampling process.
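
A minimal sketch of drawing with replacement, again in Python rather than the R used for this post. The row ids and the seed are assumptions for illustration. In R the same behaviour is requested by passing replace = TRUE to sample().

```python
import random

# Hypothetical stand-in for the dataset: one id per row (3755 rows).
population_ids = list(range(3755))

rng = random.Random(0)  # arbitrary fixed seed so the draw is repeatable

# random.choices draws WITH replacement: the same id may appear repeatedly.
sample_ids = rng.choices(population_ids, k=800)

distinct = len(set(sample_ids))
print(len(sample_ids), distinct)
```

The number of distinct ids is expected to fall below 800, since some rows are selected more than once.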

 

The 800 sampled values are:-

 

  [1]  25000 235000 240000  56536  60000  36773 160000 168400 110000 129300 199000 174500  46178 100000  84053 204500 180000

 [18] 150000 111000  19073  59102 130000 165000 174000 112900 133300 140000   5707 202800  82365  90000  61566 110000  90000

 [35] 130000 170000 188800 190000 100000  35610 100000  18238  60938 185000 120402 135000 183500 180180 155000 179000 185900

 [52] 160000 132320 123700 210000 120000 185000 148700  79833 120160 100000 225000 115000 252000 120000 180000  74540 130000

 [69] 102000 140000 150000 124500 184000 199000 162500 175000 160000  73546 131300 200000 112900 130000 108000 125000  90320

 [86] 110000 152380  48289 110600 309400 160000  86000  95000 185000 100000 110600 125000 198200 200000 194500 115000 200000

[103] 151800 260000  22809 121700  81500 150000  42923 105380 208000 131300 190000 180000  12000 110000 137400  58000  99100

[120] 132320 130000  86000 221000  89306 110000 155000  72200 120000 300000 196200  80000 120000 110000  40777  36773 109006

[137] 145000  12000 167580 135000 136260 216000  55685 291500 186000 125000  38000 191475 225000  63000 120000 153000 162000

[154] 156400 140000  63810 300240  53654 126000 122500 140100 139000 160000  49253  78990 105000  64000 191765  66100  20000

[171] 185900  30000 297300 200000 250000 247500  66837 145000 116914 153000 174500 120000  85500 123000 120000 136000  13493

[188] 160000 114047 147000 160000  65000 430967 130000 145000 210000 100000  85000  42533 235000 126000 180000  66265 118000

[205]  65488 148700 175000 155000 139500 140000  63000 100000 250000 179975 122900  90000  27317 135000  62000 185000 140000

[222] 156400 109400 289076 288000 113476 169200 120000  70000  90000 164996 158000 261500 163800 239748 220000 191475 106500

[239] 185900  58837 150000 210000 138000 260000 150000 205000  95000 191475 108800 135000 105000 125000 112000  92250 160000

[256] 184000 179500 169200 144100  76814 342300 113000 129300 115934 141300 159699 265000  20000 185000 106020 150000  40189

[273]  13493  75000 128750 140000 247500 214200  75000  75000 130000 185900  48000  88256  73900 280700  86000  88654  55410

[290]  80000  48289  58000  33511  75000 132300  75774 160000 160000 165000 200000 210000  73880 155000 131752  92000 128000

[307]  81000  40000 272000  45760 230000 175000 180000 100000 170000 132320 230000  52500 136000  85066 150000  84053 135000

[324] 124270  17509 169000  61566 170000  92350 142200  75000  73900 160000 174000  75116  75000 120000 123648  48000  50000

[341] 100000 185000   6359  67723 115000 230000 252000 262000 109280 100000 225900 153600 110600 141525 182000  76814 150000

[358] 107309 190000 178600 123700  42197  38400 135000 125000 110600 115934 100000 145000 101228  60000 110000 156400 230000

[375] 120000 185900  37824  94000  20000 175000  13000 100000 180000 195000 153600 150000 204500 165750 115000 151800  61896

[392] 115000 106500 200000 147000 149000  44365 130000 160000 119000  10354 161500 280000 314100  37824  90000 150000 189650

[409]  50000 127075  65488  67723 116000  87000 314100  75000 130000 253200 186000  93918  92350 214618 130000  64000 103691

[426] 110600  68293 120000 243900 145000 350000 175000 250000 237000 210000 195000 250000  85500  70000 180000 120000 110000

[443] 156400 182000 214200 124000 230400 190000 120000  90700 106020  63000  54685 156400 153090 154600  70000 130000 208049

[460] 110000 100000 152000 190000  87980 120000 150000 184000 100000 119059 310000 120000 175000 222200 128000  70000 150000

[477] 120000 180000 190000 132000 160000  65488  29944 109000 195400 119000  50000 126000 170000 102772  70000 150000 130000

[494] 123700  82000 100000 140000  60000 235000 120000  13000 140000 179400 110037  81666 108000  70139 283200 200000  63831

[511] 275300 188700 100000 105500  73546 205300  53654 120000  63000  68293  95386 216000  70186 265000 110000  99000  93700

[528]  12877 106020 160000 185900 124000 126000 128000 190000 120000 170000 260500  73900 240000 245000 250000  87000  78000

[545]  73880 171250 100000 155000 149040 121523  90000 185900 140000  20171 160000 114000 175000 200000 108800  20000  20000

[562] 136620 160000 105000 250000 186000 155000  33808 179975 145000  64200 128875 190000 105000 252000 153600 160000 150000

[579] 140000 139500 259000 220000 100000 145000 175000 200000 133300 162000 235000 234100 145000 170000  90734 160000 208775

[596] 280000  89294 207000 104000 257000 258000  78000  84053 135000 135000 160000  57723 275000  50432 136000 100000  10354

[613] 250000  54094  17805 108000 185000  71786  23000 125000  94564 200000  75020 110000 105000  98000  49253 110000  90700

[630]  60000 107250 131300 108800 105200 121700 180000 110600  57872 160000 210000  82528  81666 142000 291500 110000  18907

[647] 106000 236900 160000 195400 129400 210000 185900  12608  75000 237000 110000  46178 190000 175000  52533 120000 155000

[664] 160000  50000 104000 214618 204500 119500  72914 205000  70000 121500 106020 188100  41689 127000 325000 119059  42026

[681]  95000   7799 144000 170000 141525  94500 185900 252000 150000 135000 115000  90700 130000  55475 196000 160000 145000

[698]   5723 100000 174500 130000 204500 171600  50000 110000 129300 155000  84053 205300 127075 340000 129300 150000 100000

[715] 167580  67723 151410 297300 250000 150000  65000 135000 100000 222200  97218 225000 130000 185900  21013 112000 194500

[732]  48289 124740 191475 136000 231250  86193 179820 150000 185900 250000  95000 248100 191475 145000  80000 135000 130000

[749]  63192 185900 120000 170000  95000 160000 120000 160000 200000  92350 135000 170000 142200 250000 130000  69741  13400

[766] 231250 194000 160000 100000 112000 126000 190000 150000 185900 184000  12000 160080  72946  85000 145000  48289  63000

[783] 152900 109006 130000 142200 205000 201000  75000  49268  70000 120402  85000  38000 142200 128000  30000 192400 210000

[800] 192564

 

 

Mean = x̅ = Sum of observations / no. of observations = 135492.7

 

SD = s = sqrt( Σ(xi − x̅)² / n ) = 64703.85

 

SE = s / √n = 2287.627
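
As a quick arithmetic check, both reported standard errors can be recomputed from the reported standard deviations and n = 800. This is a Python sketch (the post itself used R). The variable names are illustrative, and the numbers are copied from the results reported above.

```python
import math

n = 800

# (reported SD, reported SE) for each sample, copied from the results above.
reported = {
    "without replacement": (64099.79, 2266.27),
    "with replacement": (64703.85, 2287.627),
}

recomputed = {}
for label, (sd, se_reported) in reported.items():
    recomputed[label] = sd / math.sqrt(n)  # SE = s / sqrt(n)
    print(label, round(recomputed[label], 2), se_reported)
```

The recomputed values agree with the reported ones to within rounding.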




