This package requires R version 3.3 or higher.
If you do not already have an updated version of R installed on your computer, you can find instructions on how to download the latest version here.
Also note that this package only includes the bit-vector implementation of FLAME, so it cannot be applied to database management systems.
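
The version requirement above can be checked from within R before installing. The snippet below is a minimal sketch using only base R; the variable names `min_version` and `r_ok` are illustrative and not part of FLAME.

```r
# Compare the running R version against the 3.3 minimum stated above.
min_version <- "3.3.0"
r_ok <- getRversion() >= min_version

if (r_ok) {
  message("R ", as.character(getRversion()), " meets the minimum version requirement.")
} else {
  message("R ", as.character(getRversion()), " is older than ", min_version, "; please update R.")
}
```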

FLAME can be installed from the author's GitHub using the following command:


`devtools::install_github('https://github.com/vittorioorlandi/FLAME')`

Input data must be stored in a *data frame*, which must contain covariates and treatment, and may contain an outcome column.
Covariates are assumed to be **categorical** and will be coerced to factors, though they may be passed as either factors or numerics.
If you wish to use continuous covariates for matching, they should be binned prior to being passed to FLAME.
Treatment must be denoted by a logical or binary numeric column. The outcome column, if supplied, will be treated as continuous if numeric,
as binary if a two-level factor or numeric with two unique values, and as multi-class if a factor with more than two levels.
If no outcome column is provided, matching will still be done, but CATEs will not be estimated. They will also not be estimated if the
outcome is passed as a factor.
Below is a sample dataset satisfying the format requirements:

x_1 | x_2 | ... | x_m | treated | outcome |
---|---|---|---|---|---|
3 | 0 | ... | 4 | 1 | 7.76 |
0 | 2 | ... | 1 | 0 | 5 |
... | ... | ... | ... | ... | ... |
0 | 6 | ... | 2 | 1 | 4.4 |
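
As an illustration of the format requirements, the sketch below builds a small data frame by hand and bins a continuous covariate with base R's `cut()` before it would be passed to FLAME. The values, the `age` variable, the bin boundaries, and the name `toy` are all made up for this example; how to bin a real covariate is a modelling choice that this sketch does not settle.

```r
# A hypothetical continuous covariate that should be binned before matching.
age <- c(23, 35, 47, 51, 62, 29, 44, 58)

toy <- data.frame(
  # Categorical covariates, passed here as numerics.
  x_1 = c(3, 0, 1, 2, 0, 3, 1, 2),
  x_2 = c(0, 2, 2, 1, 6, 0, 1, 6),
  # cut() maps each age to one of three bins: (0,30], (30,50], (50,Inf].
  age_bin = cut(age, breaks = c(0, 30, 50, Inf),
                labels = c("young", "middle", "older")),
  # Treatment as a binary numeric column.
  treated = c(1, 0, 1, 0, 1, 0, 1, 0),
  # Continuous outcome.
  outcome = c(7.76, 5, 6.1, 3.2, 4.4, 5.5, 6.9, 2.8)
)

str(toy)
```

The resulting `toy` data frame has categorical covariates, a binary numeric treatment column, and a numeric outcome column, matching the layout of the table above.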

To generate sample data for exploring FLAME's functionality, use the function gen_data.
Load the FLAME package with library('FLAME') before calling any of the functions discussed in this section.
This example generates a data frame with n = 250 units and p = 5 covariates:

`library('FLAME')`

`data <- gen_data(n = 250, p = 5)`

To run the algorithm, use the FLAME function. The required data parameter can be either
a path to a .csv file or a data frame. In this example, a .csv file path is used:

`library('FLAME')`

`FLAME_out <- FLAME(data = "data.csv", treated_column_name = "treated", outcome_column_name = "outcome")`

`print(FLAME_out$data)`

The object FLAME_out is a list of six entries:

Entry | Description |
---|---|
FLAME_out | A data frame containing the original data with an extra logical column denoting whether a unit was matched and an extra numeric column denoting how many times a unit was matched. The covariates that each unit was not matched on are denoted with asterisks. |
FLAME_ | A list of every matched group formed by the algorithm. |
FLAME_ | A vector containing the conditional average treatment effect (CATE) for every matched group formed. |
FLAME_ | A list corresponding to MGs that gives the covariates, and their values, on which units in each matched group were matched. |
FLAME_ | A list containing the covariates that were used for matching on each iteration of the algorithm. |
FLAME_ | A vector of the covariate dropped at each iteration. |

To find the matched groups of particular units after running FLAME, use the function MG.
In this example, the function returns the matched groups of units 1 and 2:

`MG(c(1,2), FLAME_out)`

To find the CATEs of particular units, use the function CATE.
In this example, the function returns the CATEs of units 1 and 2:

`CATE(c(1,2), FLAME_out)`

To find the average treatment effect (ATE) or average treatment effect on the treated (ATT), use the functions ATE
and ATT, respectively:

`ATE(FLAME_out = FLAME_out)`

`ATT(FLAME_out = FLAME_out)`

data:file, Dataframe, required |
The data to be matched. |

holdout:numeric, file, Dataframe, optional (default = 0.1) |
Holdout data used to compute predictive error. If a numeric scalar between 0 and 1 is provided,
that proportion of data will be used.
Otherwise, if a file path or dataframe is provided, that dataset will serve as the holdout data. |

C:numeric, optional (default = 0.1) |
Tradeoff parameter between predictive error and balancing factor. A greater value prioritizes more matches, while a lower value prioritizes not dropping important covariates. Must be a positive scalar. |

treated_column_name:string, optional (default = 'treated') |
The name of the column which specifies whether a unit is treated or control. |

outcome_column_name:string, optional (default = 'outcome') |
The name of the column which specifies each unit's outcome. |

PE_method:string, optional (default = 'ridge') |
The method used to compute predictive error (PE). If 'ridge', performs cross-validation using glmnet::cv.glmnet with default parameters. If 'xgb', performs cross-validation using xgboost::xgb.cv over a wide range of parameter values and selects the best values with respect to root-mean-square error (RMSE) for continuous outcomes or misclassification rate for binary or multi-class outcomes. |

user_function, optional (default = NULL) |
Optional function to be used instead of the methods provided by PE_method to fit unit outcomes from the covariates.
Must take in a matrix of covariates as its first argument and a vector of outcomes as its second argument. |

user_list, optional (default = NULL) |
A named list of optional parameters to be used by user_ |

user_function, optional (default = NULL) |
Optional function to be used instead of the default predict method for generating predictions from the
output of user_. Must take in an object of the type returned by user_
as its first argument and a matrix of values for which to generate predictions as its second argument. |

user_list, optional (default = NULL) |
A named list of optional parameters to be used by user_ |

replace:logical scalar, optional (default = FALSE) |
Specifies whether the same unit can be matched multiple times on different sets of covariates. If TRUE, the balancing factor is computed by dividing by the total number of treatment/control units instead of the number of unmatched treatment/control units. |

verbose:integer, optional (default = 2): |
Controls how progress is displayed while the algorithm is running. If 0, prints nothing. If 1, prints stopping condition. If 2, prints the iteration number, the number of units left to match on every 5th iteration, and the stopping condition. |

return_logical scalar, optional (default = FALSE) |
If TRUE, the predictive error at each iteration will be returned. |

return_logical scalar, optional (default = FALSE) |
If TRUE, the balancing factor at each iteration will be returned. |

early_integer, optional (default = Inf): |
Specifies the number of iterations after which to hard-stop the algorithm. If 0, one round of exact matching is performed before stopping. |

early_numeric, optional (default = 0.25) |
Nonnegative numeric denoting the maximum acceptable percent change in predictive error relative to that computed using all covariates before the algorithm will stop iterating. Default corresponds to 25%. |

early_numeric, optional (default = 0) |
Minimum acceptable proportion of unmatched control units after which the algorithm will stop iterating. Must be between 0 and 1. |

early_numeric, optional (default = 0) |
Minimum acceptable proportion of unmatched treatment units after which the algorithm will stop iterating. Must be between 0 and 1. |

early_numeric, optional (default = Inf) |
Maximum acceptable predictive error. If FLAME attempts to drop a covariate which would increase PE above this threshold, it will stop iterating. |

early_numeric, optional (default = 0) |
Minimum acceptable balancing factor. If FLAME attempts to drop a covariate which would decrease BF below this threshold, it will stop iterating. |

missing_integer, optional (default = 0) |
If 0, assume no missingness in matching data. If 1, drop units with missingness from matching data. If 2, impute missing values using mice::mice on the matching dataset for the number of imputations specified by missing_. If 3, do not match units on covariates that are missing. |

missing_integer, optional (default = 0): |
If 0, assume no missingness in holdout data. If 1, drop units with missingness from holdout data. If 2, impute missing values using mice::mice on the holdout dataset for the number of imputations specified by missing_. |

missing_integer, optional (default = 5) |
If missing_=2, specifies the number of imputations on the matching set. |

missing_integer, optional (default = 5) |
If missing_=2, specifies the number of imputations on the holdout set. |

impute_logical scalar, optional (default = TRUE) |
If TRUE, treatment assignment is used to impute covariates when missing_data=2 or missing_=2. |

impute_logical scalar, optional (default = FALSE) |
If TRUE, outcome information is used to impute covariates when missing_data=2 or missing_=2. |
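
To make the numeric form of the holdout parameter concrete, the sketch below sets aside 10% of a dataset as holdout data using a simple random split in base R. This only illustrates the idea of holding out a proportion of the data; the source does not describe how FLAME performs the split internally, and the names `df`, `holdout_prop`, `holdout_df`, and `matching_df` are hypothetical.

```r
set.seed(42)  # fixed seed so the split is reproducible

# Hypothetical dataset with 50 units in the required format.
df <- data.frame(
  x_1 = sample(0:3, 50, replace = TRUE),
  x_2 = sample(0:2, 50, replace = TRUE),
  treated = rbinom(50, 1, 0.5),
  outcome = rnorm(50)
)

# Proportion of units to set aside, mirroring the default of 0.1.
holdout_prop <- 0.1
n_holdout <- round(holdout_prop * nrow(df))

# Draw the holdout rows at random; the remaining rows are used for matching.
holdout_idx <- sample(seq_len(nrow(df)), size = n_holdout)
holdout_df <- df[holdout_idx, ]
matching_df <- df[setdiff(seq_len(nrow(df)), holdout_idx), ]
```

Supplying a separate file path or data frame as the holdout set, as the parameter also allows, avoids removing any units from the data to be matched.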

#generates toy data
**gen_data**(n = 250, p = 5, write = FALSE, path = getwd(), filename = "FLAME.csv")
#returns matched groups for specified units
**MG**(units, FLAME_out, multiple = FALSE, index_only = FALSE)
#returns CATEs for specified units
**CATE**(units, FLAME_out, multiple = FALSE)
#returns ATE for matched dataset
**ATE**(FLAME_out)
#returns ATT for matched dataset
**ATT**(FLAME_out)

n:integer, optional (default = 250) |
Number of units desired in the dataset created by gen_data. |

p:integer, optional (default = 5) |
Number of covariates desired in the dataset created by gen_data. Must be greater than 2. |

write:logical scalar, optional (default = FALSE) |
Specifies whether the output of gen_data is stored as a .csv file according to the parameters path and filename. |

path:string, optional (default = getwd( )) |
If write is TRUE, specifies the location path of the file created by gen_data. |

filename:string, optional (default = 'FLAME.csv') |
If write is TRUE, specifies the name of the file created by gen_data. |

units:numeric vector, required |
Vector of indices for the units of interest for the functions MG and CATE. |

FLAME_out:list, required |
The output of a call to FLAME. |

multiple:logical scalar, optional (default = FALSE) |
If FALSE (default), then the functions MG and CATE will only return objects pertaining to a unit's main matched group. If TRUE, the aforementioned functions will return objects pertaining to every matched group containing a specified unit (only relevant if replace=TRUE). |

index_only:logical scalar, optional (default = FALSE) |
If TRUE, the function MG will return only the indices of the units in each matched group. |