Cross Table Function in R

Uploaded by

RAHUL SHARMA

Faculty of: FCE
Program: [Link]
Class/Section: Sem V, Sec. A,B,C (AIDS)
Date:
Name of Faculty: Seema Kaloria
Name of Course: R Programming
Code: BADCCE5104
1. Flat Table Objects in R
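In this handout, a flat table means tabular data held in a data frame: values stored in rows and columns
that can be analyzed and manipulated. (Base R also provides ftable(), which prints a multi-way contingency
table in a flat, two-dimensional layout.)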
Example of Flat Table Object:

# Create a flat table (data frame)
flat_table <- data.frame(
  id = 1:5,
  name = c("Alice", "Bob", "Charlie", "David", "Eva"),
  age = c(25, 30, 35, 40, 45),
  gender = c("F", "M", "M", "M", "F")
)
print(flat_table)

Output:

  id    name age gender
1  1   Alice  25      F
2  2     Bob  30      M
3  3 Charlie  35      M
4  4   David  40      M
5  5     Eva  45      F

2. Cross Tables (Contingency Tables)

A cross table (also known as a contingency table) is used to summarize categorical data and show the
relationship between two or more categorical variables. In R, the table() function is often used to create these
tables.

Example: Cross Tabulation of Gender and Age Group

# Adding age groups to the flat table
# cut() uses right-closed intervals by default, so age 30 falls in "20-30" and age 40 in "30-40"
flat_table$age_group <- cut(flat_table$age,
                            breaks = c(20, 30, 40, 50),
                            labels = c("20-30", "30-40", "40-50"))

# Cross table (contingency table) for gender and age group
cross_table <- table(flat_table$gender, flat_table$age_group)
print(cross_table)
Session 2024-25
Output:

    20-30 30-40 40-50
  F     1     0     1
  M     1     2     0

This cross table shows how many males and females fall into different age groups.
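
To show how such a table can be read in more detail, the sketch below rebuilds the example data, appends totals with addmargins(), and converts the counts to within-gender shares with prop.table(). The dimension names Gender and AgeGroup and the variables with_totals and row_shares are chosen here for illustration; they do not appear in the original example.

```r
# Rebuild the example data so this block runs on its own
flat_table <- data.frame(
  age    = c(25, 30, 35, 40, 45),
  gender = c("F", "M", "M", "M", "F")
)
flat_table$age_group <- cut(flat_table$age,
                            breaks = c(20, 30, 40, 50),
                            labels = c("20-30", "30-40", "40-50"))

# Naming the arguments labels the two dimensions of the table (illustrative names)
cross_table <- table(Gender = flat_table$gender, AgeGroup = flat_table$age_group)

# Append a "Sum" row and a "Sum" column holding the totals
with_totals <- addmargins(cross_table)
print(with_totals)

# Share of each age group within each gender; margin = 1 makes every row sum to 1
row_shares <- prop.table(cross_table, margin = 1)
print(round(row_shares, 2))
```

Using margin = 2 instead would give the share of each gender within an age group.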
3. Testing Cross-Tabulation

You can test if there’s a significant relationship between two categorical variables in a cross table using the
Chi-squared test.

Example: Chi-squared Test on Cross Table

# Perform Chi-squared test on the cross table
chisq_test <- chisq.test(cross_table)
print(chisq_test)

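Output:

        Pearson's Chi-squared test

data:  cross_table
X-squared = 2.9167, df = 2, p-value = 0.2326

The output reports the test statistic, the degrees of freedom, and the p-value. A p-value of 0.2326 is well
above the usual 0.05 threshold, so these five observations give no evidence that gender and age group are
associated. R also prints the warning "Chi-squared approximation may be incorrect" here, because every
expected count is below 5; with a sample this small, the result is only illustrative.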
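
As a rough check on where the test statistic comes from, the sketch below recomputes it by hand from the expected counts and compares it with chisq.test(). The table is typed in directly as a matrix so the block runs on its own; the variable names (expected, statistic, p_value, builtin, exact_p) are illustrative. The fisher.test() call at the end is an added suggestion for small samples, not part of the original example.

```r
# The gender-by-age-group counts from the example, entered column by column
cross_table <- matrix(c(1, 1, 0, 2, 1, 0), nrow = 2,
                      dimnames = list(c("F", "M"), c("20-30", "30-40", "40-50")))

n <- sum(cross_table)

# Expected count under independence: (row total * column total) / grand total
expected <- outer(rowSums(cross_table), colSums(cross_table)) / n
print(expected)

# Pearson statistic: sum over cells of (observed - expected)^2 / expected
statistic <- sum((cross_table - expected)^2 / expected)
df <- (nrow(cross_table) - 1) * (ncol(cross_table) - 1)
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)
print(c(statistic = statistic, df = df, p_value = p_value))

# Built-in test for comparison; the small-sample warning is suppressed here
builtin <- suppressWarnings(chisq.test(cross_table))
print(all.equal(statistic, unname(builtin$statistic)))

# With expected counts this small, Fisher's exact test is a common alternative
exact_p <- fisher.test(cross_table)$p.value
print(exact_p)
```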
4. Recreating Original Data from a Contingency Table

To recreate the original data from a contingency table, you need to expand it back into its raw format,
one row per observation, typically by combining expand.grid() (to list every cell) with rep() (to repeat
each cell's row according to its count).

Example:

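# Example contingency table (for simplicity)
contingency_table <- matrix(c(2, 3, 1, 4), nrow = 2,
                            dimnames = list(Gender = c("F", "M"),
                                            AgeGroup = c("20-30", "30-40")))

# Recreate original data from the contingency table
# expand.grid() returns a data frame with one row per cell, first variable varying fastest
recreated_data <- expand.grid(Gender = c("F", "M"), AgeGroup = c("20-30", "30-40"))
# c() flattens the matrix column by column, which matches the row order of expand.grid()
recreated_data$Count <- c(contingency_table)
# Repeat each row Count times, then drop the Count column (column 3)
recreated_data <- recreated_data[rep(seq_len(nrow(recreated_data)), recreated_data$Count), -3]
print(recreated_data)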

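Output:

    Gender AgeGroup
1        F    20-30
1.1      F    20-30
2        M    20-30
2.1      M    20-30
2.2      M    20-30
3        F    30-40
4        M    30-40
4.1      M    30-40
4.2      M    30-40
4.3      M    30-40

This shows the expanded dataset where each row represents an observation from the contingency table.
The row labels (1, 1.1, 2, 2.1, ...) record which cell each observation was copied from; assigning
NULL to rownames(recreated_data) renumbers them from 1 to 10.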
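
One way to check such an expansion is to cross-tabulate the recreated rows and compare the result with the starting table. The sketch below does this using as.data.frame(as.table(...)), a base R alternative to expand.grid() that produces the cells and their counts in one step. The names cell_counts, recreated, and round_trip are illustrative.

```r
# Same example table as above, rebuilt so this block runs on its own
contingency_table <- matrix(c(2, 3, 1, 4), nrow = 2,
                            dimnames = list(Gender = c("F", "M"),
                                            AgeGroup = c("20-30", "30-40")))

# One row per cell, with the cell count in a column named Freq
cell_counts <- as.data.frame(as.table(contingency_table))
print(cell_counts)

# Repeat each cell's row Freq times and keep only the two category columns
recreated <- cell_counts[rep(seq_len(nrow(cell_counts)), cell_counts$Freq),
                         c("Gender", "AgeGroup")]
rownames(recreated) <- NULL  # renumber rows 1..n

# Round trip: tabulating the expanded rows should give back the original counts
round_trip <- table(recreated$Gender, recreated$AgeGroup)
print(all(round_trip == contingency_table))
```

If the final line prints TRUE, no observations were lost or assigned to the wrong cell during the expansion.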
5. More Advanced Testing and Operations

For more advanced operations, you can use the dplyr package for group summaries and reshape2 or tidyr for
reshaping data. For example:

Using dplyr for summarization:

library(dplyr)
flat_table %>% group_by(gender, age_group) %>% summarise(count = n())

Output:

# A tibble: 4 × 3
# Groups:   gender [2]
  gender age_group count
  <chr>  <fct>     <int>
1 F      20-30         1
2 F      40-50         1
3 M      20-30         1
4 M      30-40         2
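
The grouped summary lists only the four gender and age-group combinations that occur in the data, while the cross table from section 2 has six cells, two of them zero. The base R sketch below (no dplyr needed) makes that difference explicit by converting the table to long form with as.data.frame(). The names all_cells and observed are illustrative, and responseName = "count" is chosen only to mirror the dplyr column name.

```r
# Rebuild the example data so this block runs on its own
flat_table <- data.frame(
  age    = c(25, 30, 35, 40, 45),
  gender = c("F", "M", "M", "M", "F")
)
flat_table$age_group <- cut(flat_table$age,
                            breaks = c(20, 30, 40, 50),
                            labels = c("20-30", "30-40", "40-50"))

# Long form of the cross table: one row per cell, including empty cells
all_cells <- as.data.frame(
  table(gender = flat_table$gender, age_group = flat_table$age_group),
  responseName = "count"
)
print(all_cells)

# Keeping only non-empty cells leaves the same combinations as the grouped summary
observed <- all_cells[all_cells$count > 0, ]
print(observed)
```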

Reshaping Data with tidyr:

library(tidyr)
# Spread age into one column per age group; cells with no matching value become NA
pivot_table <- pivot_wider(flat_table, names_from = age_group, values_from = age)
print(pivot_table)

Output:

# A tibble: 5 × 6
     id name    gender `20-30` `30-40` `40-50`
  <int> <chr>   <chr>    <dbl>   <dbl>   <dbl>
1     1 Alice   F           25      NA      NA
2     2 Bob     M           30      NA      NA
3     3 Charlie M           NA      35      NA
4     4 David   M           NA      40      NA
5     5 Eva     F           NA      NA      45
