This repository contains the scripts used in the paper "Automated Selection of Network Traffic Models through Bayesian and Akaike Information Criteria", and a tutorial on how to reproduce the experiments with the same or new pcap files.
Summary:
- Dependencies
- Setup Environment
- Tutorial
- Files documentation
To run these scripts, the following dependencies are required:
sudo apt install python3-pip
pip install -U matplotlib --user
pip3 install numpy
sudo apt-get install python3-tk
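After installing the packages above, a quick check that Python can actually find them may save a failed run later. The snippet below is a minimal sketch, not part of this repository: the helper name missing_modules is hypothetical, and the module list simply mirrors the packages installed above (python3-tk provides the tkinter module).

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Modules corresponding to the packages installed above (assumed list).
    required = ["matplotlib", "numpy", "tkinter"]
    missing = missing_modules(required)
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("All Python dependencies found.")
```

If a module is reported as missing, re-run the matching install command for the Python 3 interpreter you intend to use.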
To install and set up Octave, use the following commands:
# add repository
sudo add-apt-repository ppa:octave/stable
sudo apt-get update
# install Octave
sudo apt-get install octave
sudo apt-get install liboctave-dev
# other packages dependencies
sudo apt-get install gnuplot epstool transfig pstoedit
Install the statistics packages on Octave to run the simulations. Run the following command to start the Octave CLI:
octave-cli
Inside the Octave CLI, run the following commands:
octave> pkg -forge install io
octave> pkg -forge install statistics
After running these commands, a directory called .config/octave will appear in your home directory. It may have ownership/access problems. To solve them, run this command in a shell terminal:
sudo chown $USER ~/.config/octave/qt-settings
The pcap files used in these experiments are provided in the repository https://github.com/AndersonPaschoalon/Pcaps. To run the tests, we recommend cloning this repository (or its code-only version) and the Pcaps repository side by side:
(root-dir)/
├── aic-bic-paper/
├── Pcaps/
You can prepare this layout with the following commands:
mkdir aic-bic-tests
cd aic-bic-tests
git clone https://github.com/AndersonPaschoalon/Pcaps
git clone https://github.com/AndersonPaschoalon/aic-bic-paper
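To confirm that both repositories ended up side by side, a small check such as the one below can be run from the root directory (aic-bic-tests in the commands above). This is only an illustrative sketch: the function name missing_repos is hypothetical, and the expected directory names are taken from the layout shown above.

```python
from pathlib import Path


def missing_repos(root, expected=("aic-bic-paper", "Pcaps")):
    """Return the expected repository directories that are absent under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_repos(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Both repositories are in place.")
```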
To generate the pcap files:
./git-setup.sh --merge
After that, to clean up the local repository (excluding the part files), you may execute:
./git-setup.sh --rm
To run the simulations, use run.py. This script automates and simplifies the execution of the other scripts and keeps the runs consistent, so you do not need to know their inner details.
Running ./run.py --help shows a usage example:
./run.py <pcap_file> <simulation_name>
Eg.: ./run.py ../pcaps/wireshark-wiki_http.pcap wombat
The first argument must be the relative path to the pcap file, and the second the name of the simulation. A directory with the simulation name will be created in the plots/ directory. After that, to generate the figures, run:
./plot.py --simulation "plots/<simulation_name>"
Suppose you have a pcap file at ../Pcaps/wombat-test.pcap (relative to this directory). We may script the execution of the tests as below:
./run.py ../Pcaps/wombat-test.pcap wombat
./plot.py --simulation "plots/wombat"
The command ./plot.py --paper will also create some additional plots for the paper.
To recreate all the plots in the paper, after setting up the environment, execute:
./run.py Pcaps/skype.pcap skype
./run.py Pcaps/bigFlows.pcap bigFlows
./run.py Pcaps/equinix-1s.pcap equinix-1s
./run.py Pcaps/lanDiurnal.pcap lanDiurnal
./plot.py --simulation "./plots/skype/"
./plot.py --simulation "./plots/bigFlows/"
./plot.py --simulation "./plots/equinix-1s/"
./plot.py --simulation "./plots/lanDiurnal/"
./plot.py --paper
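The same sequence can be driven from a loop, which makes it easier to add or remove traces. The sketch below is an assumption-laden convenience wrapper, not a script shipped with this repository: the names PAPER_TRACES, paper_commands and run_all are hypothetical, and the paths simply reproduce the commands listed above. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Trace names taken from the commands listed above.
PAPER_TRACES = ["skype", "bigFlows", "equinix-1s", "lanDiurnal"]


def paper_commands(traces=PAPER_TRACES, pcap_dir="Pcaps"):
    """Build the run.py and plot.py command lines for every trace."""
    cmds = [["./run.py", f"{pcap_dir}/{t}.pcap", t] for t in traces]
    cmds += [["./plot.py", "--simulation", f"./plots/{t}/"] for t in traces]
    cmds.append(["./plot.py", "--paper"])
    return cmds


def run_all(dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in paper_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the batch at the first failing command.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Call run_all(dry_run=False) from the directory that contains run.py and plot.py to actually execute the commands.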
These are the scripts located in the dataProcessor directory, used by the run.py script.
- pcap-filter.sh: extracts inter-packet times from pcaps.
├── timerelative2timedelta.m: script used by pcap-filter.sh
- dataProcessor.m: runs the simulations and stores the data in the data/ directory.
├── adiff.m: computes the absolute difference
├── cdfCauchyPlot.m: creates the values of a Cauchy CDF and plots them in a figure
├── cdfExponentialPlot.m: creates the values of an Exponential CDF and plots them in a figure
├── cdfNormalPlot.m: creates the values of a Normal CDF and plots them in a figure
├── cdfWeibullPlot.m: creates the values of a Weibull CDF and plots them in a figure
├── cdfParetoPlot.m: creates the values of a Pareto CDF and plots them in a figure
├── cdfplot.m: creates the values of a Cauchy CDF and plots them in a figure
├── computeCost.m: computes the cost for linear regression
├── cumulativeData.m: accumulates a vector
├── gradientDescent.m: gradient descent algorithm
├── informationCriterion.m: computes the AIC or BIC of a given function and dataset
├── likehood_log.m: computes the logarithm of the likelihood function
├── matrix2File.m: saves a matrix into a text file
├── empiricalCdf.m: evaluates the empirical CDF
├── plotData.m: wrapper for plotting x and y data
├── qqPlot.m: wrapper for Q-Q plots in Octave
├── sameLength.m: ensures two vectors have the same size; if not, the bigger one is truncated
├── setxlabels.m: sets the x tick labels on the axes of figures
├── sff2File.m: saves a vector to a file
├── data/: place where dataProcessor.m saves the generated data
├── figures/: figures plotted by dataProcessor.m
- calcCostFunction.py: auxiliary script that computes the cost function for the simulated data and saves it in the file costFunction.dat.
- aicBicRelativeDiff.py: script that computes the relative difference between AIC and BIC.
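To make the roles of the inter-packet times, the log-likelihood and the information criteria more concrete, here is a small Python sketch. It is not a port of informationCriterion.m or likehood_log.m and may differ from them in conventions; it uses the textbook definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), an exponential model fitted by maximum likelihood, and made-up timestamps. All function names are hypothetical.

```python
import math


def inter_packet_times(timestamps):
    """Convert relative arrival times into inter-packet times (deltas)."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def exponential_log_likelihood(samples):
    """Log-likelihood of an exponential model fitted by maximum likelihood."""
    n = len(samples)
    total = sum(samples)
    rate = n / total  # maximum likelihood estimate of the rate parameter
    return n * math.log(rate) - rate * total


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


if __name__ == "__main__":
    times = [0.0, 0.5, 1.5, 2.0, 4.0]  # made-up arrival times, in seconds
    deltas = inter_packet_times(times)
    ll = exponential_log_likelihood(deltas)
    # The exponential model has a single parameter (the rate).
    print("AIC:", aic(ll, 1))
    print("BIC:", bic(ll, 1, len(deltas)))
```

Under both criteria, the candidate model with the lowest value is the preferred one; BIC penalizes extra parameters more strongly as the number of samples grows.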