Since R version 2.12, there's a droplevels() function:
    levels(droplevels(subdf$letters))
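For anyone who wants to run this end to end, here is a minimal sketch. The construction of `df` is an assumption reconstructed from the outputs quoted in the answers (letters a to e, numbers 1 to 5); the original data may have been built differently.

```r
# Assumed reconstruction of the question's data: five letters, five numbers.
df <- data.frame(letters = factor(c("a", "b", "c", "d", "e")),
                 numbers = 1:5)
subdf <- subset(df, numbers <= 3)

# Subsetting keeps all five levels, including the unused "d" and "e".
levels(subdf$letters)

# droplevels() on a single factor returns a copy without the unused levels.
levels(droplevels(subdf$letters))

# droplevels() also has a data frame method that processes every factor column.
subdf <- droplevels(subdf)
levels(subdf$letters)
```

Note that `droplevels()` returns a new object, so the result has to be assigned back if the change should persist.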
83
All you should have to do is to apply factor() to your variable again after subsetting:
    > subdf$letters
    [1] a b c
    Levels: a b c d e
    > subdf$letters <- factor(subdf$letters)
    > subdf$letters
    [1] a b c
    Levels: a b c
EDIT
From the factor help page example:
    factor(ff) # drops the levels that do not occur
For dropping levels from all factor columns in a data frame, you can use:
    subdf <- subset(df, numbers <= 3)
    subdf[] <- lapply(subdf, function(x) if (is.factor(x)) factor(x) else x)
73
If you don't want this behaviour, don't use factors; use character vectors instead. I think this makes more sense than patching things up afterwards. Try the following before loading your data with read.table or read.csv:
    options(stringsAsFactors = FALSE)
The disadvantage is that you're restricted to alphabetical ordering. (reorder is your friend for plots)
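To make the ordering trade-off concrete, here is a small sketch. The `sizes` and `value` vectors are hypothetical data made up for illustration; they are not from the question.

```r
# Hypothetical data for illustration only.
sizes <- c("small", "large", "medium", "small")
value <- c(1, 30, 12, 3)

# A character vector carries no level order, so sorting and tabulating
# fall back to alphabetical order.
sort(unique(sizes))
names(table(sizes))

# A factor with explicit levels keeps a custom order.
sizes_f <- factor(sizes, levels = c("small", "medium", "large"))
levels(sizes_f)

# reorder() derives the level order from another variable
# (by default the mean of `value` within each group), which helps for plots.
sizes_by_value <- reorder(sizes, value)
levels(sizes_by_value)
```

So with character vectors the custom order has to be rebuilt at the point of use, for example with `reorder()` or an explicit `factor(..., levels = ...)` call just before plotting.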
67
It is a known issue, and one possible remedy is provided by drop.levels() in the gdata package, where your example becomes:
    > drop.levels(subdf)
      letters numbers
    1       a       1
    2       b       2
    3       c       3
    > levels(drop.levels(subdf)$letters)
    [1] "a" "b" "c"
There is also the dropUnusedLevels function in the Hmisc package. However, it only works by altering the subset operator [ and is not applicable here.
As a corollary, a direct approach on a per-column basis is a simple as.factor(as.character(data)):
    > levels(subdf$letters)
    [1] "a" "b" "c" "d" "e"
    > subdf$letters <- as.factor(as.character(subdf$letters))
    > levels(subdf$letters)
    [1] "a" "b" "c"
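One thing worth checking before using the round trip through character: it rebuilds the levels from the strings, so a custom level order is not kept. A minimal sketch, using a made-up factor `x` (hypothetical, not from the question):

```r
# Hypothetical factor with a deliberately non-alphabetical level order.
x <- factor(c("low", "high", "mid"), levels = c("low", "mid", "high"))
sub <- x[x != "mid"]          # "mid" is now an unused level

# factor() and droplevels() drop "mid" but keep the remaining order.
levels(factor(sub))
levels(droplevels(sub))

# The character round trip re-sorts the levels alphabetically.
levels(as.factor(as.character(sub)))
```

For the letters example the two approaches give the same result, because the levels were alphabetical to begin with.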
50
Another way of doing the same, but with dplyr:
    library(dplyr)
    subdf <- df %>%
      filter(numbers <= 3) %>%
      droplevels()
    str(subdf)
Edit:
This also works (thanks to agenis):
    subdf <- df %>%
      filter(numbers <= 3) %>%
      droplevels
    levels(subdf$letters)