Wednesday, March 27, 2013

Structure as a graph


> # Assignment 1: Create 3 vectors x, y, z with random values, ensuring they are of equal length, and combine them with T <- cbind(x,y,z)
> # Create a 3-dimensional plot of the result
> sample<-rnorm(50,25,6)
> sample
 [1] 23.10381 25.85777 22.04959 43.53180 12.11174 37.23922 38.92648 22.77181 17.80844 30.41365 32.32586 37.09651 24.55097 19.85470 31.01534 29.70007 31.72610 22.26199
[19] 19.85826 36.94503 23.50247 18.00116 24.50004 27.57822 20.34054 17.32243 30.26892 19.03535 16.14514 28.81016 29.45099 23.10639 25.49178 35.95906 19.35419 23.04064
[37] 25.20819 18.83031 30.75433 19.14759 28.11077 25.91251 28.03618 33.34057 30.19792 25.07813 25.08856 26.12123 24.15002 22.09888
> x<-sample(sample,10)
> y<-sample(sample,10)
> z<-sample(sample,10)
> x
 [1] 22.77181 31.01534 27.57822 22.09888 24.15002 19.85826 28.03618 17.32243 31.72610 19.35419
> y
 [1] 19.85826 26.12123 30.19792 23.10639 31.72610 22.04959 24.55097 27.57822 28.03618 31.01534
> z
 [1] 28.11077 31.72610 28.81016 12.11174 30.41365 23.50247 24.55097 22.04959 29.45099 30.26892
> T<-cbind(x,y,z)
> T
             x        y        z
 [1,] 22.77181 19.85826 28.11077
 [2,] 31.01534 26.12123 31.72610
 [3,] 27.57822 30.19792 28.81016
 [4,] 22.09888 23.10639 12.11174
 [5,] 24.15002 31.72610 30.41365
 [6,] 19.85826 22.04959 23.50247
 [7,] 28.03618 24.55097 24.55097
 [8,] 17.32243 27.57822 22.04959
 [9,] 31.72610 28.03618 29.45099
[10,] 19.35419 31.01534 30.26892
> library(rgl)   # plot3d() comes from the rgl package
> plot3d(T)
> plot3d(T,col=rainbow(1000))
> plot3d(T,col=rainbow(1000),type='s')
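
A minimal self-contained version of the same exercise, for reference: set.seed() is added here so the random draws are reproducible, and the matrix is named m because T masks the built-in constant TRUE.

> library(rgl)                     # provides plot3d()
> set.seed(42)                     # reproducible random draws
> s<-rnorm(50,25,6)                # 50 values, mean 25, sd 6
> x<-sample(s,10); y<-sample(s,10); z<-sample(s,10)
> m<-cbind(x,y,z)
> plot3d(m,col=rainbow(10),type='s')   # spheres, one colour per point
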
> # Assignment 2: Create 2 random variables and create 3 plots:
> # 1. X-Y and X-Y|Z (introduce a variable z with 5 different categories and cbind it to x and y)
> x<-rnorm(1500,100,10)
> y<-rnorm(1500,85,5)
> z1<-sample(letters,5)
> z2<-sample(z1,1500,replace=TRUE)
> z<-as.factor(z2)
> t<-cbind(x,y,z)   # note: cbind() coerces the factor z to its integer codes; data.frame(x,y,z) would preserve it
> library(ggplot2)   # qplot() comes from the ggplot2 package
> qplot(x,y)
> qplot(x,z)
> qplot(x,z,alpha=I(1/10))
> qplot(x,y,geom=c("point","smooth"))
> qplot(x,y,colour=z)
> qplot(log(x),log(y),colour=z)
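
The X-Y|Z plot asked for in the assignment can also be drawn with facets, one panel per category of z; a minimal sketch (data.frame() is used instead of cbind() so that z stays a factor):

> d<-data.frame(x,y,z)            # keeps z as a factor
> qplot(x,y,data=d,facets=~z)     # one X-Y panel per category of z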







Thursday, March 21, 2013

Analyze your Finance

ReadyRatios Financial Analysis

ReadyRatios is a program (web service, SaaS) that carries out an intelligent analysis of a company’s financial position based on data from its financial statements.
The main feature of the system is that no human analyst needs to take part in the analysis. All the user has to do is enter the financial statement data (prepared according to IFRS or US GAAP) and receive results that do not differ from an analysis made by a professional analyst.

Software features:

 

Knowledge of financial analysis is not required:

  ReadyRatios performs all of the analytical work, from the moment the user enters the data until the analytical report is ready.

Highly usable web-interface:

  The program is web-based and can be opened in your favourite browser. It supports Internet Explorer 6+, Firefox 3+, Google Chrome 7+ and Opera 9+.

 

The program provides the user with a complete report

  The program not only calculates over 40 coefficients and ratios, but also describes the calculated results, draws conclusions and gives an opinion on the financial state of the company. The analysis uses unique scoring methods.


Multiple choice conditional texts

  Multiple-choice conditional texts make each report unique. The texts of reports are never repeated, even when reports are generated from the same data (the probability of repetition is close to zero).


Analysis of report dynamics over several periods

  Users can get an analysis report for a single reporting period or for several periods (annual, biannual, quarterly or monthly).


Flexible approach to making tables

  Users no longer have to glue together pages of tables when analysing data that spans many years. The program can condense wide tables without significant loss of informational value.


It is easy and convenient to enter the initial data

  Data is entered into a form laid out like a financial statement, so no preliminary or additional calculations are required.



Information can be saved in the database

  The user can save the initial data and prepared reports both on their own computer and on the ReadyRatios server. The latter option provides fast and convenient access to the information for further analysis. To use this feature, the user must log in to the ReadyRatios website.



Great opportunities for professionals

  If you want to set the parameters of the analysis according to your own requirements, you can easily:
  • change any formula or add your own
  • change the qualitative properties of a ratio (i.e. the ranges of satisfactory and unsatisfactory values), including separate ones for different industries
  • set up a report template (add or remove ratios as desired)
The program takes the user's settings into account while keeping the analysis fully autonomous.


User report settings

  The user can limit the report to only the required sections, select options for displaying tables and text, and set a criticism level. Setting a criticism level can suppress critical remarks and comments in the report text (useful when the financial state of the company is poor), while leaving the informative values and conclusions unaffected.

[Screenshots of the ReadyRatios interface appeared here.]

Thursday, March 14, 2013

Panel Data Analysis

Panel data analysis of the "Produc" data using three models:
1. Pooled
2. Fixed
3. Random
Analyzing which model is best suited:
       pFtest : for choosing between fixed and pooled
       plmtest : for choosing between pooled and random
       phtest : for choosing between random and fixed

Commands:
Loading the package and data:
> library(plm)   # provides plm(), pFtest(), plmtest(), phtest()
> data(Produc , package ="plm")
> head(Produc)




Pooled Effects Model:

> pool <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data = Produc, model = "pooling", index = c("state","year"))
> summary(pool)
 


Fixed Effects Model:
 
> fixed <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data = Produc, model = "within", index = c("state","year"))
> summary(fixed)



Random Effects Model:
> random <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data = Produc, model = "random", index = c("state","year"))
> summary(random)




Comparison
 
The comparison between the models is a hypothesis test based on the following concept:

H0 (null hypothesis): the individual- and time-based effects are all zero
H1 (alternative hypothesis): at least one of the individual- or time-based effects is non-zero

Pooled vs Fixed
 
Null Hypothesis: Pooled Effects Model
Alternative Hypothesis: Fixed Effects Model
 
Command:
> pFtest(fixed,pool)
Result:
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) +      log(emp) + log(unemp)
F = 56.6361, df1 = 47, df2 = 761, p-value < 2.2e-16
alternative hypothesis: significant effects
Since the p-value is negligible, we reject the null hypothesis and accept the alternative, i.e. the fixed effects model.
 
Pooled vs Random
 
Null Hypothesis: Pooled Effects Model
Alternative Hypothesis: Random Effects Model
 
Command :
> plmtest(pool)
 
Result:
 
        Lagrange Multiplier Test - (Honda)
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) +      log(emp) + log(unemp)
normal = 57.1686, p-value < 2.2e-16
alternative hypothesis: significant effects
 
Since the p-value is negligible, we reject the null hypothesis and accept the alternative, i.e. the random effects model.
 
 
Random vs Fixed
 
Null Hypothesis: no correlation between the individual effects and the regressors (random effects model is consistent)
Alternative Hypothesis: Fixed Effects Model
 
Command:
 > phtest(fixed,random)
 
Result:
 
        Hausman Test
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) +      log(emp) + log(unemp)
chisq = 93.546, df = 7, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent
 
Since the p-value is negligible, we reject the null hypothesis and accept the alternative, i.e. the fixed effects model.
Conclusion:
The fixed effects model is the best suited for panel data analysis of the "Produc" data.
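
Having settled on the fixed effects model, the state-specific intercepts can be inspected with fixef() from plm (a brief sketch):

> fixef(fixed)            # one estimated intercept per state
> summary(fixef(fixed))   # tests whether each effect differs from zero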
 
 








 

Wednesday, February 13, 2013

Stationary Time Series

> # Assignment 1: Create the log of the return data (way 1: return = (S_t - S_(t-1)) / S_(t-1))
> # Calculate the historical volatility.
> # Create the ACF plot for the log(returns) data, run the ADF test and interpret. Data: NSE Nifty index (Jan 2012 to 31 Jan 2013)
Program:
> z<-read.csv(file.choose(),header=T)
> closingprice<-z$Close
> closingprice.ts<-ts(closingprice,frequency=252)
> laggingtable<-cbind(closingprice.ts,lag(closingprice.ts,k=-1),closingprice.ts-lag(closingprice.ts,k=-1))
> Return<-(closingprice.ts-lag(closingprice.ts,k=-1))/lag(closingprice.ts,k=-1)
> Manipulate<-scale(Return)+10   # shift the standardised returns so all values are positive before taking logs
> logreturn<-log(Manipulate)
> acf(logreturn)
The figure shows that all the autocorrelations beyond lag 0 lie within the 95% confidence bands, and hence we can
say that the time series is stationary.
> T<-252^.5                           # square root of 252 trading days
> Historicalvolatility<-sd(Return)*T  # annualised historical volatility
> Historicalvolatility
[1] 0.1475815
> library(tseries)   # provides adf.test()
> adf.test(logreturn)

        Augmented Dickey-Fuller Test

data:  logreturn
Dickey-Fuller = -5.656, Lag order = 6, p-value = 0.01
alternative hypothesis: stationary

Warning message:
In adf.test(logreturn) : p-value smaller than printed p-value

Since the p-value is less than 0.05 (i.e. 1 - 0.95), the null hypothesis of non-stationarity is rejected; the time series is stationary, so data analysis can proceed.
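
For reference, the textbook log return log(S_t / S_(t-1)) can be computed directly with diff(log(...)), avoiding the scale-and-shift manipulation above; a minimal sketch reusing closingprice.ts from this session:

> logret<-diff(log(closingprice.ts))   # log(S_t) - log(S_(t-1))
> acf(logret)                          # same stationarity check as above
> sd(logret)*sqrt(252)                 # annualised historical volatility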






Thursday, February 7, 2013

Returns and Forecasting

Objective 1: Find the returns of NSE data spanning more than 6 months, taking the 10th data point as the start and the 95th data point as the end. Also plot the results.

Solution:
Step 1: Read the data from a CSV file for the period 1/12/2011 to 5/02/2013.
Command:
 z<-read.csv(file.choose(),header=T)

Step 2: Choose the Close column.
Command:
 close<-z$Close

Step 3: Reshape the data into a matrix of order 1x298, as 298 data points are available in close.
Command:
dim(close)<-c(1,298)

Step 4: Create a time-series object from the close data, elements (1,10) through (1,95).
Command:
close.ts<-ts(close[1,10:95],deltat=1/252)
Step 5: Calculate the difference between successive values.
Command:
close.diff<-diff(close.ts)
Step 6: Calculate the return and combine the series:
Command:
return<-close.diff/lag(close.ts,k=-1)
final<-cbind(close.ts,close.diff,return)
Step 7: Plot
Command:
plot(return,main="Return from 10th to 95th")
plot(final,main="Data from 10th to 95, Difference, Return")

Objective 2: Data points 1-700 are available; predict data points 701-850 using GLM estimation with logit analysis.

Step 1: Read the data from a CSV file.

Command:
z<-read.csv(file.choose(),header=T)

Step 2:Check the dimension of z
Command
dim(z)


Step 3:Choose 1-700 data
Command

 new<-z[1:700,1:9]

Step 4: Inspect the first few rows.
Command
head(new)

Step 5: Convert ed to a factor and run the logit regression.
Command

 new$ed <- factor(new$ed)
 new.est<-glm(default ~ age + ed + employ + address + income, data=new, family ="binomial")
 summary(new.est)

Step 6: Predict default probabilities for data points 701-850.
Command
Prediction<-z[701:850,1:8]
 Prediction$ed<-factor(Prediction$ed)
 Prediction$prob<-predict(new.est, newdata =Prediction, type = "response")
 head(Prediction)
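
To turn the predicted probabilities into 0/1 default predictions, a cutoff can be applied; 0.5 is a common but arbitrary choice, and the column name default.pred below is illustrative:

 Prediction$default.pred<-ifelse(Prediction$prob>0.5,1,0)   # classify at the 0.5 cutoff
 table(Prediction$default.pred)                             # counts of predicted non-defaulters vs defaulters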





Tuesday, January 22, 2013

The Determinant and ANOVA way

DAY#3


ASSIGNMENT 1a:


Fit ‘lm’ and comment on the applicability of ‘lm’

Plot 1: Residuals vs the independent variable.

Plot 2: Standardized residuals vs the independent variable.
> file<-read.csv(file.choose(),header=T)
> file
  mileage groove
1       0 394.33
2       4 329.50
3       8 291.00
4      12 255.17
5      16 229.33
6      20 204.83
7      24 179.00
8      28 163.83
9      32 150.33
> x<-file$groove
> x
[1] 394.33 329.50 291.00 255.17 229.33 204.83 179.00 163.83 150.33
> y<-file$mileage
> y
[1]  0  4  8 12 16 20 24 28 32
> reg1<-lm(y~x)
> res<-resid(reg1)
> res
         1          2          3          4          5          6          7          8          9
 3.6502499 -0.8322206 -1.8696280 -2.5576878 -1.9386386 -1.1442614 -0.5239038  1.4912269  3.7248633
> plot(x,res)
[Figure: residuals vs groove]

As the residual plot is parabolic, a straight-line model is not appropriate for this data.
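
The assignment also asks for Plot 2, standardized residuals against the independent variable, which the transcript omits; a minimal sketch using rstandard():

> stdres<-rstandard(reg1)    # internally studentized (standardized) residuals
> plot(x,stdres,main="Standardized residuals vs groove")
> abline(h=0)                # reference line at zero
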
Assignment 1 (b) - Alpha-Pluto Data
Fit ‘lm’ and comment on the applicability of ‘lm’.

Plot 1: Residuals vs the independent variable.

Plot 2: Standardized residuals vs the independent variable.

Also do:

Q-Q plot (qqnorm)
Q-Q line (qqline)
> file<-read.csv(file.choose(),header=T)
> file
   alpha pluto
1  0.150    20
2  0.004     0
3  0.069    10
4  0.030     5
5  0.011     0
6  0.004     0
7  0.041     5
8  0.109    20
9  0.068    10
10 0.009     0
11 0.009     0
12 0.048    10
13 0.006     0
14 0.083    20
15 0.037     5
16 0.039     5
17 0.132    20
18 0.004     0
19 0.006     0
20 0.059    10
21 0.051    10
22 0.002     0
23 0.049     5
> x<-file$alpha
> y<-file$pluto
> x
 [1] 0.150 0.004 0.069 0.030 0.011 0.004 0.041 0.109 0.068 0.009 0.009 0.048
[13] 0.006 0.083 0.037 0.039 0.132 0.004 0.006 0.059 0.051 0.002 0.049
> y
 [1] 20  0 10  5  0  0  5 20 10  0  0 10  0 20  5  5 20  0  0 10 10  0  5
> reg1<-lm(y~x)
> res<-resid(reg1)
> res
         1          2          3          4          5          6          7
-4.2173758 -0.0643108 -0.8173877  0.6344584 -1.2223345 -0.0643108 -1.1852930
         8          9         10         11         12         13         14
 2.5653342 -0.6519557 -0.8914706 -0.8914706  2.6566833 -0.3951747  6.8665650
        15         16         17         18         19         20         21
-0.5235652 -0.8544291 -1.2396007 -0.0643108 -0.3951747  0.8369318  2.1603874
        22         23
 0.2665531 -2.5087486
> plot(x,res)


> qqnorm(res)
 
> qqline(res)
 

Assignment 2: Test the null hypothesis using ANOVA

> file<-read.csv(file.choose(),header=T)
> file

   Chair Comfort.Level Chair1
1      I             2      a
2      I             3      a
3      I             5      a
4      I             3      a
5      I             2      a
6      I             3      a
7     II             5      b
8     II             4      b
9     II             5      b
10    II             4      b
11    II             1      b
12    II             3      b
13   III             3      c
14   III             4      c
15   III             4      c
16   III             5      c
17   III             1      c
18   III             2      c
> file.anova<-aov(file$Comfort.Level~file$Chair1)
> summary(file.anova)

            Df Sum Sq Mean Sq F value Pr(>F)
file$Chair1  2  1.444  0.7222   0.385  0.687

Conclusion: p-value = 0.687

Since the p-value is high, we cannot reject the null hypothesis. Thus we can say that the chair types do not differ significantly in comfort level.
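
A quick visual check of this conclusion (a sketch using the same data frame):

> boxplot(Comfort.Level~Chair1,data=file,xlab="Chair type",ylab="Comfort level")   # overlapping boxes agree with the non-significant F test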

Tuesday, January 15, 2013

Day-2 The Regression way

The day started with building the concepts needed for regression analysis. The groundwork was laid with problems on matrices: selecting columns and rows, multiplying matrices, and finally regression and distributions.

Assignment 1:
Create two 3x3 matrices, select one column from matrix 1 and one from matrix 2, and merge them into another matrix using the cbind command.
command:
z1<-c(1:9)
dim(z1)<-c(3,3)
z2<-c(10:18)
dim(z2)<-c(3,3)
x<-z1[,3]
y<-z2[,1]
z3<-cbind(x,y)
z3
Assignment 2:
Multiply matrix 1 and matrix 2.

z1%*%z2

Assignment 3:
Read historical index data from the NSE site for Dec 1 2012 to Dec 31 2012. Find the regression and residuals.
To read the NSE file:
nse<-read.csv(file.choose(),header=T)
reg<-lm(high~open,data=nse)
To find residuals
residuals(reg)
Assignment 4:
Generate normal distribution data and plot it.
x=seq(70,130,length=200)
y=dnorm(x,mean=100,sd=10)
plot(x,y)
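
By default plot() draws points; for a smooth density curve, type = "l" is the usual choice (a minor variant of the commands above):

plot(x,y,type="l",main="Normal density, mean 100, sd 10")   # draw the curve as a line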

Tuesday, January 8, 2013

The Analyst Way

  

Assignment 1: Plotting a histogram

Assignment 2: Plotting a point-and-line diagram

Assignment 3: Scatter diagram for the High and Low data

[Plots for Assignments 1-3 appeared here.]
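
Minimal sketches of the three plots above, assuming the NSE data frame z read in as in Assignment 4 below (the column names Close, High and Low are assumptions about the CSV layout):

> hist(z$Close)                           # Assignment 1: histogram of closing prices
> plot(z$Close,type="o")                  # Assignment 2: points joined by lines
> plot(z$High,z$Low,main="High vs Low")   # Assignment 3: scatter diagram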

Assignment 4: Volatility of the max and min of the share value

Code:
> z<-read.csv(file.choose(),header=T)   # NSE data; columns 3 and 4 are assumed to be High and Low
> zcol3<-z[,3]
> zcol4<-z[,4]
> mergeddata<-c(zcol3,zcol4)
> summary(mergeddata)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   4888    5660    5723    5758    5884    6021
> range(mergeddata)
[1] 4888.20 6020.75
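
The transcript stops short of an explicit volatility figure; one simple sketch, assuming volatility here means the dispersion of the merged High/Low series:

> sd(mergeddata)                    # standard deviation of the merged series
> sd(mergeddata)/mean(mergeddata)   # coefficient of variation, a scale-free alternative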