Weighting the BNES 2019 data

This short post describes the procedure used to calculate the post-stratification weights included in the 2019 Belgian National Electoral Study (BNES) data. Organized by ISPO, the BNES is a post-election study of the Belgian federal elections. The study combines traditional politically oriented questions (e.g. voting behaviour, left-right orientations), socio-demographic variables (e.g. occupation, ethnicity, gender), and attitudinal dispositions (e.g. democratic values, anti-immigrant stances). The weights available in the dataset have been computed with the statistical suite R (R Core Team 2019) and the survey package version 4.0 (Lumley 2020).

Post-stratification weights

The 2019 BNES is designed to be a representative sample of the Belgian electorate. The target population of the BNES consists of all Belgian residents aged 18 or older who were eligible to vote in the 2019 federal election. Despite the effort to obtain a sample that closely resembles the target population, some groups of people are easier to interview than others. This phenomenon is well known in survey research and is related to two different types of “biases” or “errors.” The first, “sampling error,” refers to the fact that some groups of people are more difficult to reach than others. The second, “non-response bias,” refers to the tendency of some groups of people to be less likely to agree to be interviewed than others.

To mitigate the potential mismatch between the BNES sample and the target population, we computed post-stratification weights. These weights give more or less importance (“weight”) to individuals who were more or less difficult to reach or interview than others. For instance, if the Belgian electorate consists of 50% females and 50% males, but the sample consists of 40% females and 60% males, we can use a weight that makes the male observations in our sample count less and the female observations count more.
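In its simplest form, this adjustment is just the ratio of the population share to the sample share for each group. A minimal numeric sketch (in Python rather than R, using the made-up 50/50 vs. 40/60 figures from the example above):

```python
# Post-stratification weight per group = population share / sample share.
# Toy figures: 50/50 population, 40/60 sample.
population_share = {"female": 0.50, "male": 0.50}
sample_share = {"female": 0.40, "male": 0.60}

weights = {g: population_share[g] / sample_share[g] for g in population_share}
# Each female observation now counts 1.25 times and each male ~0.83 times,
# so the weighted sample is 50% female and 50% male again.
```

This is exactly the logic the survey package applies, except over the full joint distribution of age, gender, education, and region rather than a single variable.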

Different methods can be used to generate weights (for an overview, see Pew Research 2018). The method generally used at ISPO is a population-based method, in which information on population subclasses is used to calculate the weighting coefficients. The main reason to employ this type of post-stratification weight is that our variables of interest (e.g. anti-immigration attitudes) are likely to vary as a function of individual-level characteristics such as education, age, or the geographical area of the country.

Post-stratification weights should ideally be calculated using census data. However, Belgium last conducted a census in 2011, and using those data would match our sample to the 2011 Belgian population. As is commonly done, we instead use population data from the European Union Labour Force Survey (LFS). The LFS data have been provided by Ellen Quintelier (thanks!) of Statbel, the “Direction Générale Statistique” of Belgium.

Weights in the BNES 2019 data

The BNES data include six variables for weighting:

  1. w_age_bel: weighting coefficients (w) for the joint distribution of age (a), gender (g) and education (e) for the Belgian sample (bel).
  2. w_age_wal: weighting coefficients (w) for the joint distribution of age (a), gender (g) and education (e) for the Walloon sample (wal). Brussels respondents are included in the calculation of the w_age_wal weights.
  3. w_age_vla: weighting coefficients (w) for the joint distribution of age (a), gender (g) and education (e) for the Flemish sample (vla).
  4. w_agev_bel: weighting coefficients (w) for the joint distribution of age (a), gender (g), education (e) and voting behaviour (v) for the Belgian sample (bel).
  5. w_agev_wal: weighting coefficients (w) for the joint distribution of age (a), gender (g), education (e) and voting behaviour (v) for the Walloon sample (wal). Since respondents from the Brussels-Capital Region can vote for both Flemish and Walloon parties, they have been excluded from the w_agev_wal weight calculations (set to missing).
  6. w_agev_vla: weighting coefficients (w) for the joint distribution of age (a), gender (g), education (e) and voting behaviour (v) for the Flemish sample (vla).

In this post, I focus on the calculation of w_age_bel and w_agev_bel. The calculations for the Wallonia and Flanders samples are just a matter of subsetting the data according to the corresponding regions.

Demographics weights

The first step in calculating post-stratification weights is to pre-process the BNES and LFS data sets. Specifically, we need to manipulate our survey data so that the R survey package can automatically match the BNES age × sex × education × region contingency table with the population frequencies from the LFS data set.

## Read the LFS data ##
lsf <- readxl::read_excel("kub_BNES_2019.xlsx",
                          sheet=1)
table(lsf$REG1)

BXL VLA WAL 
 42  42  42 
table(lsf$sex1)

MEN WOM 
 63  63 
table(lsf$age18)

18-27 28-37 38-47 48-57 58-67 68-77    78 
   18    18    18    18    18    18    18 
table(lsf$educat3c)

 1  2  3 
42 42 42 

The LFS data set is aggregated by specific categories. For instance, education (educat3c) has 3 categories corresponding to low, medium, and high levels of education. These correspond to the ISCED education codes used in the BNES survey to measure respondents' educational level. In the BNES, however, the education variable has 10 categories: categories 1 to 5 correspond to a low level of education (ISCED 0, 1, 2), 6 to 8 to a medium level of education (ISCED 3, 4), and 9 to 10 to a high level of education (ISCED 5, 6, 7, 8). Thus, we need to recode the education variable in the BNES data set to match the one in the LFS. In addition to matching the structure of the LFS dataset, aggregating categories reduces the number of cells with only a few respondents. This introduces a small bias in the weights but greatly helps to reduce instabilities and the standard errors of the (regression) estimates.

## Read the BNES data ##
BNES_2019 <- haven::read_sav("BNES_2019_complete_update_02_10_2021.sav")

BNES_selected <- BNES_2019 %>% select(age,
                                      PROVINCE,
                                      region,
                                      q13,       # education 
                                      q2,        # gender 
                                      addressID, # respondent's ID 
                                      q24        # self-reported vote
                                      )

# recode education to match the LFS data
BNES_selected$educat3c <- ifelse(BNES_selected$q13 <=5 , 1,
                                  ifelse(BNES_selected$q13 >5 & BNES_selected$q13 <=8 , 2,
                                    ifelse(BNES_selected$q13 >8 & BNES_selected$q13 <=10, 3, NA
                                         )))

Similarly, we need to recode the rest of the variables used to calculate the weights, namely age, gender, and region. Age is aggregated into 7 categories: 18-27, 28-37, 38-47, 48-57, 58-67, 68-77, and 78+. The 3 regions are Flanders (VLA), Wallonia (WAL), and Brussels-Capital (BXL). Sex assigned at birth is coded as female (WOM) or male (MEN).

# recode age to match the LFS data
BNES_selected$age18 <- ifelse(BNES_selected$age >=18 & BNES_selected$age <=27, "18-27", 
                        ifelse(BNES_selected$age > 27 & BNES_selected$age <=37, "28-37",
                          ifelse(BNES_selected$age > 37 & BNES_selected$age <=47, "38-47",
                            ifelse(BNES_selected$age > 47 & BNES_selected$age <=57, "48-57",
                              ifelse(BNES_selected$age > 57 & BNES_selected$age <=67, "58-67",
                                ifelse(BNES_selected$age > 67 & BNES_selected$age <=77, "68-77",
                                  ifelse(BNES_selected$age > 77, "78",NA
                          )))))))

# recode region to match the LFS data
BNES_selected$REG1 <- ifelse(BNES_selected$region==1, "VLA",
                             ifelse(BNES_selected$region==2, "WAL",NA
                             ))



# the Brussels-Capital Region is not coded in the original dataset, so we use
# the province variable to identify Brussels respondents
BNES_selected$REG1 <-  ifelse(BNES_selected$REG1=="WAL" & BNES_selected$PROVINCE==2, "BXL", BNES_selected$REG1)

# recode sex to match the LFS data
BNES_selected$sex1 <- ifelse(BNES_selected$q2==1, "MEN", "WOM")

Next, we need to exclude respondents who have missing values on any of the variables used to compute the weights. In other words, the BNES contingency table of age, gender, education, and region should contain no missing values. If you have a high number of missings on one of the matching variables, you can impute the missing values using the R package Amelia. If you do not have solid reasons to impute the missing observations, however, I would not recommend it.

# drop self-reported vote choice; it is only needed later for the voting weights
BNES_selected_ager <- BNES_selected %>% select(-q24)
# row-wise NAs deletion 
data_wo_na <- BNES_selected_ager[(!is.na(BNES_selected_ager$educat3c) & !is.na(BNES_selected_ager$REG1) & !is.na(BNES_selected_ager$age18) & !is.na(BNES_selected_ager$sex1)), ]
# select only the columns that we need 
data_wo_na <-  data_wo_na %>% select(addressID, sex1, age18, REG1, educat3c)
# transform our variable to factors to feed the survey package
data_wo_na_f <-  data_wo_na %>% mutate_at(vars(sex1,age18,REG1,educat3c),factor)

The next step consists of calculating the population margins and the corresponding expected frequencies. The expected frequencies tell us what the marginal distribution of the different groups in the sample would look like if the sample resembled the population. Hence, we multiply the population margins by the total number of observations in the BNES data.

# calculate the population totals
totals <- lsf %>% 
  group_by(sex1, REG1, age18, educat3c) %>% 
  summarize(count = sum(CW_ALL_Q_Sum))

# calculate the expected frequency proportionally to the total number of observations in the BNES data.
marignals_share <- totals %>% 
  ungroup() %>% 
  mutate(marignals_share = count / sum(count)) %>% 
  mutate(Freq=marignals_share * nrow(data_wo_na_f))

kable(head(marignals_share))
|sex1 |REG1 |age18 | educat3c|    count| marignals_share|     Freq|
|:----|:----|:-----|--------:|--------:|---------------:|--------:|
|MEN  |BXL  |18-27 |        1| 24656.65|       0.0027365| 4.446835|
|MEN  |BXL  |18-27 |        2| 36213.56|       0.0040192| 6.531126|
|MEN  |BXL  |18-27 |        3| 16971.64|       0.0018836| 3.060841|
|MEN  |BXL  |28-37 |        1| 21671.59|       0.0024052| 3.908479|
|MEN  |BXL  |28-37 |        2| 22445.38|       0.0024911| 4.048031|
|MEN  |BXL  |28-37 |        3| 55279.43|       0.0061352| 9.969663|

The Freq column shows how many observations we would have in each group if the relative frequencies of these groups in the BNES data were the same as those observed in the population.

The next step is to run postStratify() to calculate the weights. The function computes weighting coefficients such that the sample group sizes are as they would be in a stratified sample. In other words, it matches the relative frequencies of the BNES data set with those observed in the LFS data set.

The postStratify() function requires:

  1. a svydesign object with the survey design information, including the respondent ID (the ID is mandatory)
  2. the same data structure in the sample data set and the population data set
  3. a column named Freq containing the population margins

## 1. svydesign object. In this case, we specify only the respondent unique identifier
data_unweighted <- svydesign(ids=~addressID, data=data_wo_na_f)
## 2. drop any unused factor level
marignals_share_f <- marignals_share %>% 
  mutate_at(vars(sex1, REG1, age18, educat3c), factor)
## 3. keep only the population margins (called Freq)
freq_b <- marignals_share_f %>% 
  select(-marignals_share, -count) 

## Post stratification AGE ##
ps <- postStratify(design = data_unweighted,                 # svydesign object
                   strata = ~sex1 + age18 + educat3c + REG1, # variables for the post stratification 
                   population = freq_b,                      # population margins
                   partial = TRUE)                           # ignore strata not present in the sample  

## extract the weights to plot later
non_trimmed_w_df <- data.frame("raw_weights" = weights(ps))

In certain scenarios, the computed weights might be too large. This happens when the sample margins differ greatly from the population margins. For instance, if our sample contains only 1 female respondent aged between 18 and 25 with a university education, but our target population is Belgian university students, the weight assigned to this respondent would be extremely large. Fortunately, we can trim the weights to avoid observations that count much more than others. This introduces a small bias in the weighting coefficients but greatly reduces the standard errors (Kish 1992). We followed the ESS and trimmed the weights at a value of 4.

## Trimming ##
trimmed_w <- trimWeights(ps, 
                         upper = 4,  
                         strict = T
                         ) 

trimmed_w_df <- data.frame("trimmed_weights" = weights(trimmed_w))
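To build intuition for what trimWeights() does: it caps the weights at the bound and redistributes the trimmed excess over the remaining observations, so the total weight is preserved. A rough single-pass sketch in Python (the actual function with strict = TRUE repeats the step until no weight exceeds the bound; the numbers here are made up):

```python
def trim_weights(weights, upper=4.0):
    """Cap weights at `upper` and spread the removed excess evenly over
    the untrimmed observations, keeping the total weight unchanged.
    (Single pass only; a strict version would repeat until stable.)"""
    excess = sum(max(0.0, w - upper) for w in weights)
    capped = [min(w, upper) for w in weights]
    free = [i for i, w in enumerate(capped) if w < upper]
    for i in free:
        capped[i] += excess / len(free)
    return capped

raw = [0.8, 1.0, 1.2, 7.0]   # one extreme weight
trimmed = trim_weights(raw)  # ≈ [1.8, 2.0, 2.2, 4.0]; total stays 10
```

Note how the extreme observation no longer dominates, while the sum of the weights (and hence the weighted sample size) is unchanged.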

## Plotting Weights ##
plot_non_trimmed <- ggplot(non_trimmed_w_df, aes(x=raw_weights)) + 
                    geom_histogram(binwidth = 0.1) + 
                    geom_vline(xintercept = 1, 
                           linetype = "dashed", 
                           color = "red", 
                           size = 0.4) + 
                  xlab("Raw weights") +
                  ylab("Count") +
                  theme_classic()


plot_trimmed <- ggplot(trimmed_w_df, aes(x = trimmed_weights)) + 
                geom_histogram(binwidth = 0.1) + 
                geom_vline(xintercept = 1, 
                           linetype = "dashed", 
                           color = "red", 
                           size = 0.4) + 
                  xlab("Trimmed weights") +
                  ylab("Count")  +
                  theme_classic()

plot_non_trimmed + plot_trimmed + plot_annotation(
  title = 'Demographic weights',
  caption = 'Raw and trimmed (>4) AGE weights')

Finally, we merge the weights into the untouched dataset, using the respondent ID (addressID) as the key. This automatically sets the weights to NA for respondents with missing values on the matching variables.


## Merge the weights with the BNES data ##
binded <- cbind(data_wo_na_f, weights(trimmed_w))
# rename the weight variable according to the type of weights calculated (i.e. AGE)
names(binded)[grep("weights", names(binded))] <- "w_age_bel"
# select the weights and the rid 
binded_selected <- binded %>% 
  select(w_age_bel, addressID) 

# merge the weight variable into the original dataset using the respondent ID
# this allows us to assign an NA for those obs for which we did not compute a weight 
BNES_2019 <- left_join(BNES_2019,       # original dataset 
                       binded_selected, # weights and the rid 
                       by = "addressID" # merging by rid 
                       )

## Check everything is in order ##
# missing weights only for obs with missing on the matching variables
id_na <- BNES_2019[ ,"addressID"][is.na(BNES_2019[c("w_age_bel")])]

list_na <- c()
for (i in array(unlist(id_na))) {
list_na[[as.character(i)]] <- unlist(BNES_selected[BNES_selected$addressID==i, ])
}

kable(head(data.frame(purrr::reduce(list_na, rbind)), 5))
|      | age| PROVINCE| region| q13| q2| addressID| q24| educat3c|age18 |REG1 |sex1 |
|:-----|---:|--------:|------:|---:|--:|---------:|---:|--------:|:-----|:----|:----|
|out   |  NA|        7|      2|   4|  2|     43905|  12|        1|NA    |WAL  |WOM  |
|elt   |  NA|        2|      2|   4|  2|     43409|  52|        1|NA    |BXL  |WOM  |
|elt.1 |  NA|        2|      2|   3|  1|     43119|  12|        1|NA    |BXL  |MEN  |
|elt.2 |  NA|        2|      2|   3|  2|     41111|  13|        1|NA    |BXL  |WOM  |
|elt.3 |  NA|        2|      2|   4|  2|     41103|  77|        1|NA    |BXL  |WOM  |

Voting behaviour weights

Just as some individuals are less likely to participate in a survey at all, voters, non-voters, and supporters of particular parties may be more or less likely to take part. To account for imbalances in turnout and in levels of party support, we can calculate post-stratification weights that take the 2019 election results into account and match them to the self-reported vote in the BNES survey. This type of weight combines demographic information with the correct distribution of each party’s vote share on Election Day.

The procedure for calculating voting behaviour weights is a bit more complicated than the previous one. For the demographic weights, we used the joint distribution of the demographic variables: the LFS gives us, for example, the estimated number of males living in Wallonia, aged 18-27, with a high level of education. For voting behaviour, we do not have such a complete cross-classification of the grouping variables. We only know the marginal distribution, that is, the total number of people who voted for a certain party, who cast a blank or null vote, and who did not vote. In other words, we do not know the number of males living in Wallonia, aged 18-27, with a high level of education who voted for the Socialist Party or cast a null vote. As such, we need to use a technique called raking.

Given two different contingency tables, raking searches for the values to assign to the cells of the first table such that its marginal counts (the row and column “totals”) are the same as in the second table. For example, suppose we know from the population data that our sample should be 48% male and 52% female, with 80% turnout on Election Day. The raking procedure first adjusts the weights so that the gender ratio in the survey matches the desired population distribution. Next, the weights are adjusted so that the marginal distribution of voters and non-voters in the survey matches the population figures. If the adjustment for voting pushes the sex distribution out of alignment, the weights are adjusted again, iterating until all of the post-stratification variables match their specified targets. That is why this procedure is called raking: the name refers to the process of “raking” a garden bed alternately in each direction to smooth out the soil.
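This alternating adjustment is known as iterative proportional fitting (IPF), which is what the survey package's rake() function implements. A minimal self-contained sketch in Python (not the package's code; the sample counts and targets below are made up for illustration):

```python
def rake(table, row_targets, col_targets, max_iter=100, tol=1e-10):
    """Iterative proportional fitting: alternately rescale the rows and
    the columns of `table` until both margins match their targets."""
    for _ in range(max_iter):
        for i, t in enumerate(row_targets):          # match row margins
            s = sum(table[i])
            table[i] = [x * t / s for x in table[i]]
        for j, t in enumerate(col_targets):          # match column margins
            s = sum(row[j] for row in table)
            for row in table:
                row[j] *= t / s
        if all(abs(sum(row) - t) < tol for row, t in zip(table, row_targets)):
            break
    return table

# Toy sample counts: sex (rows: male, female) x turnout (cols: voted, abstained).
sample = [[30.0, 10.0], [40.0, 20.0]]
fitted = rake(sample, row_targets=[48, 52], col_targets=[80, 20])
# Row sums are now ~[48, 52] and column sums ~[80, 20];
# dividing each fitted cell by the original cell gives that group's weight.
```

The same back-and-forth rescaling works with any number of margins, which is how rake() can combine the demographic cross-classification with the one-dimensional voting margin.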

In addition to the LFS data, we gathered the official 2019 electoral results and derived each party’s vote share on Election Day, the share of null and blank votes, and the share of non-voters. We acquired the 2019 electoral results from the website of the Federal Public Service Home Affairs. The Election Day vote shares of the following parties are included in the weight calculations: CD&V, N-VA, Open VLD, sp.a, VB, and Groen for Flanders only; PVDA/PTB for both Flanders and Wallonia; PS, MR, cdH, Ecolo, DéFI, and PP for Wallonia only. Any other party (e.g. a francophone party chosen by a respondent in Flanders) has been excluded from the calculations, since such combinations are not present in the BNES data.

## recode self-reported voting ##
# 50=blank and null votes 
# 97=other party
# PTB and PVDA are counted separately in the federal results but it is a single party
# 99=NAs 

BNES_selected$vote <- ifelse(BNES_selected$q24 == 8, 97,                                    # francophone p
                        ifelse(BNES_selected$q24 == 9 | BNES_selected$q24==19, 97,          # other p
                          ifelse(BNES_selected$q24 == 16, 7,                                # merge PTB PVDA 
                            ifelse(BNES_selected$q24 == 77 | BNES_selected$q24 ==99, NA,    # 77 DK 99 No resp
                              ifelse(BNES_selected$q24 == 50 | BNES_selected$q24 == 51, 50, # merge blank and null 
                                 BNES_selected$q24
                        )))))



# remove missings 
data_wo_na <- BNES_selected[(!is.na(BNES_selected$educat3c) & !is.na(BNES_selected$REG1) & !is.na(BNES_selected$age18) & !is.na(BNES_selected$sex1) & !is.na(BNES_selected$vote)), ]
# factorize 
data_wo_na_f <-  data_wo_na %>% 
                 select(addressID, sex1, age18, REG1, educat3c, vote) %>%
                 mutate_at(vars(sex1, age18, REG1, educat3c, vote), factor)

## recode official election results to match the coding used in the sample ##
results_federal <- readr::read_csv("2019_results_federal.csv")

results_federal$party  <- c(
  15, # ecolo
  3,  # open vld
  14, # cdh
  13, # mr
  18, # pp
  2,  # NVA
  5,  # VB
  1,  # CD&V
  17, # defi
  7,  # pvda
  97, # other party 
  6,  # groen
  4,  # spa
  12, # PS
  rep(97, length(15:32)), # other party 
  50, # invalid 
  52  # absentee 
  )

The next step consists of calculating the population margins and the corresponding expected frequencies, both for the joint distribution of age, gender, education, and region, and for voting. Let’s start with the voting frequencies.

## frequencies voting ##
totals <- results_federal %>% 
          group_by(party) %>% 
          summarize(count = sum(votes))

marignals_share <- totals %>% 
                   ungroup() %>% 
                   mutate(marignals_share = count / sum(count)) %>% 
                   mutate(Freq = marignals_share * nrow(data_wo_na_f))

# factorise and rename to feed the rake() function  
freq_voting <- marignals_share %>% 
               mutate_at(vars(party), factor) %>%
               select(-marignals_share, -count)

names(freq_voting)[1] <- "vote" 

kable(head(freq_voting))
|vote |      Freq|
|:----|---------:|
|1    | 113.67733|
|2    | 205.04388|
|3    | 109.30283|
|4    |  85.85117|
|5    | 152.85593|
|6    |  78.07835|

Next, we need to calculate the frequencies of sex, age, and education taking into account the geographical location of the respondent (region). One issue has been glossed over so far: if we are unlucky enough to sample no one from a certain population stratum, it is not possible to calculate the weights. One of the major differences between the procedures for the demographic and the voting behaviour weights is the handling of such empty strata. With the option partial = TRUE, postStratify() automatically ignores any empty cell in the computation of the weights. The rake() function does not include a partial = TRUE option. We could aggregate the data so that we have fewer but larger groups; however, since we have just a few missing strata, and only for the Brussels region, we simply remove these empty cells manually.

## frequencies gender reg age eduction ##
totals <- lsf %>% group_by(sex1, REG1, age18, educat3c) %>% 
          summarize(count = sum(CW_ALL_Q_Sum))

# we can check the presence of any empty strata (0) in our survey data using xtab
# sum(xtabs(~sex1 + REG1 + age18 + educat3c,
#      data_wo_na_f)==0)


# remove empty cells for rake
totals <- totals[!c(totals$educat3c == 1 & totals$REG1 == "BXL" & totals$sex1== "MEN" & totals$age18=="68-77"),  ]
totals <- totals[!c(totals$educat3c == 1 & totals$REG1 == "BXL" & totals$sex1== "MEN" & totals$age18== "78"),    ]
totals <- totals[!c(totals$educat3c == 1 & totals$REG1 == "BXL" & totals$sex1== "WOM" & totals$age18== "18-27"), ]
totals <- totals[!c(totals$educat3c == 1 & totals$REG1 == "BXL" & totals$sex1== "WOM" & totals$age18== "58-67"), ]
totals <- totals[!c(totals$educat3c == 1 & totals$REG1 == "BXL" & totals$sex1== "WOM" & totals$age18== "78"),    ]
totals <- totals[!c(totals$educat3c == 2 & totals$REG1 == "BXL" & totals$sex1== "MEN" & totals$age18== "58-67"), ]
totals <- totals[!c(totals$educat3c == 2 & totals$REG1 == "BXL" & totals$sex1== "MEN" & totals$age18== "68-77"), ]

marignals_share <- totals %>% 
                   ungroup() %>% 
                   mutate(marignals_share = count / sum(count)) %>% 
                   mutate(Freq = marignals_share * nrow(data_wo_na_f))

# factorise and rename to feed the rake() function  
marignals_share_f <- marignals_share %>% 
                     mutate_at(vars(sex1, REG1, age18, educat3c), factor)

freq_ager <- marignals_share_f %>% 
             select(-marignals_share, -count) 

kable(head(freq_ager))
|sex1 |REG1 |age18 | educat3c|     Freq|
|:----|:----|:-----|--------:|--------:|
|MEN  |BXL  |18-27 |        1| 4.262968|
|MEN  |BXL  |18-27 |        2| 6.261078|
|MEN  |BXL  |18-27 |        3| 2.934282|
|MEN  |BXL  |28-37 |        1| 3.746872|
|MEN  |BXL  |28-37 |        2| 3.880654|
|MEN  |BXL  |28-37 |        3| 9.557439|

Finally, we are going to use the rake() function to iteratively match the population margins with the sample margins.

data_unweighted <- svydesign(ids=~addressID, data=data_wo_na_f)

## Run raking (IPF) for ager and voting ##
s_rake <- rake(design = data_unweighted, 
                        sample.margins = list(~sex1 + age18 + educat3c + REG1, ~vote), 
                        population.margins = list(freq_ager, freq_voting)
                     )

non_trimmed_wv_df <- data.frame("raw_weights" = weights(s_rake))

## Trimming ##
trimmed_w <- trimWeights(s_rake, 
                         upper = 4,
                         strict = T) 

trimmed_wv_df <- data.frame("trimmed_weights" = weights(trimmed_w))

# Plotting ##
plot_non_trimmed_v <- ggplot(non_trimmed_wv_df, aes(x=raw_weights)) + 
                      geom_histogram(binwidth = 0.1) + 
                      geom_vline(xintercept = 1, 
                               linetype = "dashed", 
                               color = "red", 
                               size = 0.4) + 
                    xlab("Raw weights") +
                    ylab("Count")  +
                    theme_classic()

plot_trimmed_v <- ggplot(trimmed_wv_df, aes(x = trimmed_weights)) + 
                  geom_histogram(binwidth = 0.1) + 
                  geom_vline(xintercept = 1, 
                           linetype = "dashed", 
                           color = "red", 
                           size = 0.4) + 
                    xlab("Trimmed weights") +
                    ylab("Count")  +
                    theme_classic()

(plot_non_trimmed_v + plot_trimmed_v) + plot_annotation(
  title = 'Voting weights',
  caption = 'Raw and trimmed (>4) AGEV weights')

Let’s merge the weights with the BNES dataset and check that we achieved the desired outcome.


## check everything is in order ##
# calculate vote share population
marignals_vote <- results_federal %>% group_by(party) %>% 
                                      summarize(count = sum(votes)) 

freq_vote_pop <- marignals_vote %>% 
                 ungroup() %>% 
                 mutate(Freq = count/sum(count))

# check vote choice is the same across sample and population 
kable(cbind(svymean(~vote, s_rake),freq_vote_pop$Freq))
|       |      mean| population|
|:------|---------:|----------:|
|vote1  | 0.0737685|  0.0737685|
|vote2  | 0.1330590|  0.1330590|
|vote3  | 0.0709298|  0.0709298|
|vote4  | 0.0557113|  0.0557113|
|vote5  | 0.0991927|  0.0991927|
|vote6  | 0.0506673|  0.0506673|
|vote7  | 0.0715771|  0.0715771|
|vote12 | 0.0785561|  0.0785561|
|vote13 | 0.0627869|  0.0627869|
|vote14 | 0.0307138|  0.0307138|
|vote15 | 0.0509876|  0.0509876|
|vote17 | 0.0184132|  0.0184132|
|vote18 | 0.0091943|  0.0091943|
|vote50 | 0.0536374|  0.0536374|
|vote52 | 0.1161986|  0.1161986|
|vote97 | 0.0246064|  0.0246064|

## Merging with BNES data set ##
binded <- cbind(data_wo_na_f,weights(trimmed_w))
names(binded)[grep("weights",names(binded))] <- "w_agev_bel"
binded %>% select(w_agev_bel,addressID) -> binded_selected
BNES_2019 <- left_join(BNES_2019, binded_selected, by = "addressID")

Unweighted and weighted cross-tables

Cross-table w_age_bel

Cross-table w_agev_bel

R session information

R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] knitr_1.31           patchwork_1.1.1      gtsummary_1.3.6.9015 gt_0.2.2             survey_4.0           survival_3.2-7       Matrix_1.2-18        forcats_0.5.1        stringr_1.4.0        dplyr_1.0.4          purrr_0.3.4          readr_1.4.0          tidyr_1.1.2          tibble_3.0.6        
[15] ggplot2_3.3.3        tidyverse_1.3.0     

loaded via a namespace (and not attached):
 [1] httr_1.4.2          jsonlite_1.7.2      splines_4.0.3       modelr_0.1.8        assertthat_0.2.1    highr_0.8           cellranger_1.1.0    yaml_2.2.1          gdtools_0.2.3       pillar_1.4.7        backports_1.2.1     lattice_0.20-41     glue_1.4.2          uuid_0.1-4         
[15] digest_0.6.27       rvest_0.3.6         colorspace_2.0-0    htmltools_0.5.1.1   pkgconfig_2.0.3     broom_0.7.5         haven_2.3.1         scales_1.1.1        webshot_0.5.2       processx_3.4.5      officer_0.3.16      downlit_0.2.1       generics_0.1.0      farver_2.0.3       
[29] ellipsis_0.3.1      withr_2.4.1         cli_2.3.0           magrittr_2.0.1      crayon_1.4.1        readxl_1.3.1        evaluate_0.14       ps_1.5.0            fs_1.5.0            fansi_0.4.2         broom.helpers_1.2.0 xml2_1.3.2          tools_4.0.3         data.table_1.13.6  
[43] hms_1.0.0           mitools_2.4         lifecycle_1.0.0     flextable_0.6.3     munsell_0.5.0       reprex_0.3.0        zip_2.1.1           callr_3.5.1         compiler_4.0.3      systemfonts_0.3.2   rlang_0.4.10        rstudioapi_0.13     base64enc_0.1-3     labeling_0.4.2     
[57] rmarkdown_2.7       gtable_0.3.0        DBI_1.1.1           R6_2.5.0            lubridate_1.7.9.2   stringi_1.5.3       hugodown_0.0.0.9000 Rcpp_1.0.6          vctrs_0.3.6         dbplyr_2.0.0        tidyselect_1.1.0    xfun_0.21          

References

Kish, Leslie. 1992. “Weighting for Unequal Pi.” Journal of Official Statistics 8 (2): 183–200.

Lumley, Thomas. 2020. “Survey: Analysis of Complex Survey Samples.” R package version 4.0.

Pew Research. 2018. “How Different Weighting Methods Work.” Pew Research Center Methods.

R Core Team. 2019. “R: A Language and Environment for Statistical Computing.” Vienna, Austria: R Foundation for Statistical Computing.

Alberto Stefanelli
PhD Candidate, Data Scientist, Consultant