SAS by Stevy
Monday, January 11, 2016
Thursday, January 7, 2016
Everything to know about Clinical Trials
Clinical Trials
A clinical trial (also clinical research) is a research study in human volunteers to answer specific health questions. Carefully conducted clinical trials are the fastest and safest ways to find treatments that work in people and ways to improve health. Interventional trials determine whether experimental treatments or new ways of using known therapies are safe and effective under controlled environments. Observational trials address health issues in large groups of people or populations in natural settings.
Protocol
All clinical trials must be conducted according to strict scientific and ethical principles. Every clinical trial must have a protocol, or action plan, that describes what will be done in the study, how it will be conducted, and why each part of the study is necessary, including details such as the criteria for patient participation, the schedule of tests, procedures, and medications, and the length of the study.
Phases of Clinical Trials
Clinical trials are conducted in a series of steps called phases; each phase is designed to answer a separate research question.
Phase I: Researchers test a new drug or treatment in a small group of people for the first time to evaluate its safety, determine a safe dosage range, and identify side effects.
Phase II: The drug or treatment is given to a larger group of people to see if it is effective and to further evaluate its safety.
Phase III: The drug or treatment is given to large groups of people to confirm its effectiveness, monitor side effects, compare it to commonly used treatments, and collect information that will allow the drug or treatment to be used safely.
Phase IV: Studies are done after the drug or treatment has been marketed to gather information on the drug's effect in various populations and any side effects associated with long-term use.
Below are links to papers published at various SAS user group meetings that are informative for any SAS programmer, novice or seasoned:
Clinical Trials Basics
SAS Programmer to Clinical SAS Programmer
Clinical Programming for Novice
Clinical Trials terminology for SAS Programmers
Managing Clinical Trials Data using SAS Software
Clinical Trial Terminology
Clinical Study Design and Methods Terminology
Clinical Trial Terms
Annotate CRF
Annotate Case Report Form Automation System
A Regular Language: The Annotated Case Report Form
Table, Listing and Graphs
Pharmaceutical Programming: From CRFs to Tables, Listings and Graphs, a process overview with real world examples
Creating Tables or Listings with a Zero-Record SAS® Data Set -- Basic Program Structure and Three Simple Techniques
Clinical Trial Reporting Using SAS/GRAPH SG Procedures
Show Your Graphs and Tables at Their Best on the Web with ODS
Creating Clinical Trial Summary Tables Containing P-Values: A Practical Approach Using Standard SAS Macros
Clinical Trials Validation
The 5 Most Important Clinical SAS Programming Validation Steps
A Simple Solution for Managing the Validation of SAS Programs That Support Regulatory Submissions
Clinical Trials Submissions
Data Definition Tables – Definition and Automation
Acknowledgements:
U.S. National Library of Medicine
Wednesday, December 2, 2015
Descriptive Statistics in SAS (Part -1) - PROC MEANS and PROC SUMMARY
Data Source: 2015 Google Data for Unemployment
The most commonly used SAS statistical procedures are PROC MEANS, PROC SUMMARY, PROC FREQ and PROC UNIVARIATE. We will take a detailed look at each of them.
Part 1 - PROC MEANS
Syntax:
PROC MEANS <DATA=SAS-data-set>
<statistic-keyword(s)> <option(s)>;
RUN;
where
- SAS-data-set is the name of the data set to be used
- statistic-keyword(s) specify the statistics to compute
- option(s) control the content, analysis, and appearance of output
By default, PROC MEANS prints the n-count (number of nonmissing values), the mean, the standard deviation, and the minimum and maximum values of every numeric variable in a data set. You may not always want these default statistics, so you can specify other statistic keywords. The statistic keywords that can be used with PROC MEANS are:
Descriptive Statistics
Keyword | Description |
CLM | Two-sided confidence limit for the mean |
CSS | Corrected sum of squares |
CV | Coefficient of variation |
KURTOSIS | Kurtosis |
LCLM | One-sided confidence limit below the mean |
MAX | Maximum value |
MEAN | Average |
MIN | Minimum value |
N | Number of observations with nonmissing values |
NMISS | Number of observations with missing values |
RANGE | Range |
SKEWNESS | Skewness |
STDDEV / STD | Standard deviation |
STDERR | Standard error of the mean |
SUM | Sum |
SUMWGT | Sum of the Weight variable values. |
UCLM | One-sided confidence limit above the mean |
USS | Uncorrected sum of squares |
VAR | Variance |
Quantile Statistics
Keyword | Description |
MEDIAN / P50 | Median or 50th percentile |
P1 | 1st percentile |
P5 | 5th percentile |
P10 | 10th percentile |
Q1 / P25 | Lower quartile or 25th percentile |
Q3 / P75 | Upper quartile or 75th percentile |
P90 | 90th percentile |
P95 | 95th percentile |
P99 | 99th percentile |
QRANGE | Difference between upper and lower quartiles: Q3-Q1 |
Hypothesis Testing
Keyword | Description |
PROBT | Probability of a greater absolute value for the t value |
T | Student's t for testing the hypothesis that the population mean is 0 |
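As a rough illustration of how these keywords are requested, the sketch below lists a few of them on the PROC MEANS statement. It uses the sample data set sashelp.class and its variables Height and Weight, which are not part of this post and are chosen only because they ship with SAS.

```sas
/* Request specific statistics instead of the defaults.            */
/* sashelp.class, Height and Weight are stand-ins for illustration. */
proc means data=sashelp.class n mean median q1 q3 clm t probt maxdec=2;
    var Height Weight;   /* analyze only these numeric variables */
run;
```

Once any statistic keyword is listed, PROC MEANS prints only the requested statistics rather than the five defaults.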
Let us see an example:
Write a program to import an Excel file, examine the contents of the data set, and produce the means of the numeric variables in the data set work.unemployment.
The unemployment data set contains 62000 observations, but we calculated the statistics for only the first 10 observations by using the OBS= data set option. PROC CONTENTS shows the type of each variable (character or numeric). A simple PROC MEANS step then produces the n-count (number of nonmissing values), the mean, the standard deviation, and the minimum and maximum values of all 8 numeric variables in the data set.
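The steps described above might look roughly like the following. This is a minimal sketch, not the original program: the file path is hypothetical, and DBMS=XLSX assumes the workbook is an .xlsx file and that SAS/ACCESS Interface to PC Files is available.

```sas
/* Import the Excel workbook (the path is a placeholder). */
proc import datafile="/path/to/unemployment.xlsx"
    out=work.unemployment
    dbms=xlsx
    replace;
    getnames=yes;   /* take variable names from the first row */
run;

/* List variable names, types (char/num) and lengths. */
proc contents data=work.unemployment;
run;

/* Default statistics, restricted to the first 10 observations. */
proc means data=work.unemployment(obs=10);
run;
```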
Write a PROC MEANS step that requests another descriptive statistic, such as MAX, and uses the MAXDEC= option. Use a VAR statement to limit the variables analyzed and a CLASS statement to group the observations.
We will now select only the maximum value of the variables January_Employment, February_Employment, March_Employment and Total_Quarterly_Wages. The MAXDEC= option specifies the maximum number of decimal places in the results. To produce separate analyses of grouped observations, add a CLASS statement to the MEANS procedure. PROC MEANS does not generate statistics for CLASS variables, because their values are used only to categorize data. CLASS variables can be either character or numeric.
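A sketch of such a step is shown here. The four analysis variables come from the description above; the grouping variable Area_Type is hypothetical, since the post does not name the CLASS variable that was used.

```sas
/* Maximum only, with results shown to at most 2 decimal places. */
proc means data=work.unemployment max maxdec=2;
    var January_Employment February_Employment
        March_Employment Total_Quarterly_Wages;
    class Area_Type;   /* hypothetical grouping variable */
run;
```

With CLASS, the input data does not need to be sorted, and all groups appear in a single table.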
A BY statement can also be used, but unlike the CLASS statement, the BY statement requires the data to be sorted or indexed by the BY variable. It also produces different output than CLASS: the BY statement produces a separate table for each BY group.
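For comparison, a BY-group version could be sketched as follows. Area_Type is again a hypothetical grouping variable, and work.unemployment_sorted is an illustrative name for the sorted copy.

```sas
/* BY processing needs sorted (or indexed) input, so sort first. */
proc sort data=work.unemployment out=work.unemployment_sorted;
    by Area_Type;   /* hypothetical grouping variable */
run;

proc means data=work.unemployment_sorted max maxdec=2;
    var January_Employment February_Employment
        March_Employment Total_Quarterly_Wages;
    by Area_Type;   /* one output table per BY group */
run;
```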
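A summarized output data set can be created by using PROC SUMMARY. When you use PROC SUMMARY, you use the same code to produce the output data set that you would use with PROC MEANS. The difference between the two procedures is that PROC MEANS produces a report by default (remember that you can use the NOPRINT option to suppress the default report). By contrast, to produce a report in PROC SUMMARY, you must include a PRINT option in the PROC SUMMARY statement.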
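The two uses of PROC SUMMARY can be sketched like this. The grouping variable Area_Type, the output data set name work.summary_stats and the Max_* variable names are all hypothetical and chosen only for illustration.

```sas
/* 1. Printed report: PROC SUMMARY needs the PRINT option. */
proc summary data=work.unemployment print max maxdec=2;
    var January_Employment February_Employment
        March_Employment Total_Quarterly_Wages;
    class Area_Type;   /* hypothetical grouping variable */
run;

/* 2. Output data set only: no report is printed. */
proc summary data=work.unemployment;
    var January_Employment February_Employment
        March_Employment Total_Quarterly_Wages;
    class Area_Type;
    /* names after MAX= are matched to the VAR list in order */
    output out=work.summary_stats
           max=Max_Jan Max_Feb Max_Mar Max_Wages;
run;
```

The same OUTPUT statement works unchanged in PROC MEANS; adding NOPRINT there suppresses the report.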
Thursday, October 29, 2015
String extraction using SAS Scan and Substr functions
In my last post we saw string extraction using Perl regular expression functions. Today we are going to extract and concatenate character strings using the SAS SCAN and SUBSTR functions.
Syntax:
SCAN(argument,n,<delimiters>)
where
- argument specifies the character variable or expression to scan
- n specifies which word to read
- delimiters are special characters that must be enclosed in quotation marks (single or double). If you do not specify delimiters, default delimiters are used.
SUBSTR(argument,position,<n>)
where
- argument specifies the character variable or expression to extract from
- position is the character position to start from
- n specifies the number of characters to extract. If n is omitted, all remaining characters are included in the substring.
Let us see an example:
In the dataset work.Company there are four columns Name, Age, Sex and SSN.
We want to create a new data set, employeeID, with a column named ID that contains a unique identification number for each employee, built from the last name and the last four digits of the SSN.
data employeeID;
    set company;
    /* first comma-delimited word of name, then 4 characters of SSN from position 8 */
    ID = (scan(name, 1, ","))!!(substr(SSN,8,4));
run;
proc print data=employeeID;
run;
Explanation:
scan(name, 1, ",")
- name is the character variable to scan
- 1 is the position of the word to read
- "," is the delimiter
substr(SSN,8,4)
- SSN is the character variable to extract from
- 8 is the character position to start from
- 4 specifies the number of characters to extract
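To try the idea end to end, here is a self-contained sketch. The two sample rows are invented for illustration and are not the post's data. The sketch assumes names are stored as "Last, First" and SSNs as 11-character strings such as 123-45-6789, so the last four digits start at position 8. It uses CATS instead of the !! operator so that any trailing blanks are stripped before concatenation.

```sas
/* Hypothetical sample data: pipe-delimited so names can contain commas. */
data work.company;
    infile datalines dlm='|';
    length Name $30 Sex $1 SSN $11;
    input Name $ Age Sex $ SSN $;
    datalines;
Smith, John|34|M|123-45-6789
Lee, Ana|29|F|987-65-4321
;
run;

data work.employeeID;
    set work.company;
    length ID $34;
    /* last name (text before the comma) + last four SSN digits */
    ID = cats(scan(Name, 1, ','), substr(SSN, 8, 4));
run;

proc print data=work.employeeID;
run;
```

For the first row, SCAN returns Smith and SUBSTR returns 6789, so ID becomes Smith6789.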
String Extraction Using Perl Regular Expressions (PRX) in SAS
The PRXMATCH function searches source for a match to perl-regular-expression and returns the position at which the match begins. If there is no match, PRXMATCH returns zero.
Syntax: PRXMATCH(perl-regular-expression, source)
Suppose we have a list of companies and their addresses:
7‑Eleven, Inc.community relations department dallas, TX 75221-0711
20 Century Fox 10201 pico blvd, los angeles, CA 90064
APACHE prime four business park kingswells scotland, United Kingdom
We wish to extract the names of the companies. The company names start with either upper-case letters or numbers, while the addresses are in lower case. We will therefore use the PRXMATCH function to find where the first lower-case word begins and keep the text before it.
Explanation:
prxmatch('/\b[a-z]\w*\b/', source)
In this example we are using forward slashes (/) as Perl delimiters.
\b is a word boundary (the position between a word character and a non-word character, or the start or end of the string)
[a-z] matches a lower-case letter
\w matches any word character (upper- and lower-case letters, digits and underscore)
* matches the previous subexpression zero or more times
As the result shows, the PRXMATCH function helped us extract the names of the companies.
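The original program is not shown, so the following is only a sketch of one way to apply the pattern. The data set and variable names (work.companies, address, pos, company) are hypothetical, and the sample rows are adapted from the list above, with a plain hyphen in the first name and a space after "Inc." added. PRXMATCH gives the position of the first lower-case word, and SUBSTR keeps everything before it.

```sas
data work.companies;
    infile datalines truncover;
    input address $char100.;   /* read the whole line, blanks included */
    datalines;
7-Eleven, Inc. community relations department dallas, TX 75221-0711
20 Century Fox 10201 pico blvd, los angeles, CA 90064
APACHE prime four business park kingswells scotland, United Kingdom
;
run;

data work.names;
    set work.companies;
    length company $100;
    /* position of the first word that starts with a lower-case letter */
    pos = prxmatch('/\b[a-z]\w*\b/', address);
    if pos > 1 then company = strip(substr(address, 1, pos - 1));
    else if pos = 0 then company = address;   /* no lower-case word found */
run;

proc print data=work.names;
    var company pos;
run;
```

One limitation of this approach is that any digits standing between the name and the first lower-case word, such as the street number in the second row, stay in the extracted text.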