Statistics for the Life and Social Sciences

STAD29/STA429/STA1007 Ken Butler

     
Course outline
Introduction to SAS on Mathlab
Lecture notes (last update Jan 10)
Assignments
Links to SAS documentation
Class SAS examples
 

Welcome to the home page for STA 1007 / STAD29. This is the place to look for all things course-related (notes, assignments, etc.) except for assignment marks, which will be on Blackboard.


  • Tue Apr 24 5:00pm: grades submitted for everyone.
  • Wed Apr 18 5:00pm: I've added some extra columns to Blackboard so that you can see where you stand. "Assgt average" is just that: the assignment percentage that will go into your final grade. For those of you in STA D29, that's all that's needed; your calculated *unofficial* final grade is in "d29 grade". There is also a column "1007 grade", which is not complete yet since I haven't marked the projects (tonight or tomorrow). Figure yourself about another 25 points on top of what's in "1007 grade".
  • Wed Apr 18 4:00pm: I have finished marking the final exams, and am just about to put the marks on Blackboard (so you can see how you did). I have to finish marking the projects before I can do the STA 1007 final grades, but I can do the D29 grades now.
  • Thu Apr 12 1:00pm: A reminder that the final exam is Monday April 16th at 9:00am in room IC 200.

    I can be on campus Friday morning if you have questions for me (but let me know if you want to see me, otherwise I'm staying home!). E-mail will reach me, but messages after about 2pm on Friday may not get a reply until Saturday night or Sunday.

  • Wed Apr 11 2:00pm: Assignment 7 is marked, and should be in your mailboxes now. Solutions to assignments 5 and 6 are now up (I had forgotten about those).
  • Tue Apr 3 1:30pm: I think the final exam is done. I have to do some more editing before I send it off for printing, but it's basically done. There are 8 questions, seven of which are just like the ones in past exams, and the eighth is a quick-answer "how do you do this in SAS", where the answers to the various parts are a line or two of SAS code. The exam is a rather scary 36 pages long, but most of that is my SAS code and output. So be ready to do as much reading as writing on the exam!
  • Sat Mar 31 8:30pm: "assignment 8" is up, along with my solutions. You can ignore the "assignments" 9 and 10. The one data file that you need (for the time series question) is also up (on the assignments page, or you can just type in the whole url in the question). The rest of the data is included in the questions.

    If I forget, remind me that we're doing course evaluations in class this week. They're the old-fashioned no. 2 pencil and paper ones.

  • Wed Mar 28 12:00pm: here's the overheads for today's class.
  • Thu Mar 22 9:30pm: The time series notes are up. They read a bit like a book; I'll try to edit them down for the lecture next week.
  • Thu Mar 22 9:00pm: The old final exams are up: 2011, which has SAS output incorporated in the question paper, so that it looks longer than it really is, and 2010, which has SAS output in a separate booklet. These exams were scheduled for 2 hours, which wasn't enough time, so I am giving you guys 3 hours to complete an exam of similar length. Bear in mind that we've covered different material in different years in this course, so there may be a question or two on the old exams that you won't be able to do. Ask me if in doubt.
  • Thu Mar 22 8:30pm: I am ever so near finishing marking assignment 5; that should be coming back to you tomorrow. There is an assignment 7, due next week, on principal components, factor analysis and nonparametrics. I will upload the extra data file that it needs in a few moments. Over the next couple of days, I'll dig out my time series notes and get them up here. I think I also have a couple of time series questions that can go on an assignment. (I'm also aware that I have a couple of other as yet unfulfilled promises: to re-mark any assignments on which you can't see my comments, and to put up some old exams so that you'll know what to expect. I haven't forgotten about those, and will get to them.)
  • Tue Mar 20 10:30am: I came to the conclusion that it would be sensible to do two more "hairy" things and two more straightforward things. So the plan as it stands now is to do Confirmatory Factor Analysis first this week (hairy #1), followed by some stuff on nonparametric statistics. Next week we can do either spatial statistics or time series (either of those would be hairy #2), with frequency tables in the last class (that's a nice one to finish with). Let me know if you have any preferences for spatial statistics or time series.
  • Sat Mar 17 4:00pm: I decided that assignment 6 is weighty enough as it stands, so I will leave it as is. I have three questions on principal components and factor analysis which will go on assignment 7, possibly along with something from the remaining part of factor analysis.

    I have new software for annotating PDFs. If you can't read my comments on assignments 1-4, let me know and I will re-mark them.

  • Wed Mar 14 1:30pm: Assignment 5 is due today (or in the next day or so). I've decided to delay assignment 6 until next week, but I am planning to add a couple of questions to it based on our class today.

    Talking of which, today's class is on principal components and exploratory factor analysis. We have only a few things left in my notes (and I think 3 lectures left after this one):

    • confirmatory factor analysis
    • spatial statistics
    • log-linear frequency analysis
    • (possibly) time series
    • and anything else you wish to request (instructor's familiarity permitting)
  • Mon Mar 5 2:00pm: This week I'd like to look at cluster analysis and multidimensional scaling. There might also be time for principal components. After that could come a number of things, as you prefer: I have notes on factor analysis, spatial statistics, time series and frequency tables. Let me know if any of those would be helpful to you, or if you'd like me to look at something else.
  • Mon Mar 5 2:00pm: Assignment 4 is marked, and should be in your mailbox now. Let me know if you have any problems reading it.
  • Wed Feb 29 7:30pm: The updated exercise-diet example (with today's corrections) is up.
  • Tue Feb 28 8:00pm: this week's class is about discriminant analysis, which addresses the question of what (in terms of some measured quantitative variables) makes groups (defined by categorical variables) different. This makes the technique a natural follow-up to MANOVA, but it is of interest in its own right also.

    There is also an Assignment 5 for which I will endeavour to make sure that the data files are in the right places. (If we have time, we might make a start on Cluster Analysis as well, but I'm not counting on it.)

  • Tue Feb 28 8:00pm: Some extra notes for tomorrow's class have been piggybacked onto the end of the exercise and diet example. If you don't see some stuff about Discriminant Analysis, download it again so that you have the most recent version.
  • Wed Feb 15 5:00pm: Here is the extra example from class today.
  • Tue Feb 14 8:00pm: HTML output from SAS is possible, and looks kind of flash. If you want to experiment, you can do this: at the top of your .sas file, below the "options" line (if you have one) but above everything else, put these two lines:
            ods html;
            ods graphics on;
        
    and then your data step and proc step(s) as usual. When you run SAS (in the usual way), you'll find the usual .log and .lst files, but also a file sashtml.htm, which is your output in HTML format. At the command line, type "firefox sashtml.htm &" (with no quotes, and wait for what seems like an eternity) to see it. I believe you'll be able to select the bits of the output you want and copy and paste them into a Word document (I was able to do this into OpenOffice all right). Since the HTML output includes the graphs that would otherwise pop up in their own windows, you are spared the necessity of taking a screenshot or similarly devious business.

    There's no obligation on you to get this working in this course. I offer it merely as something you might like to try. If simple copying from the .lst file works for you, it works for me.

    I believe the HTML file contains the output from the current run of SAS only, so make sure to copy it before it disappears! I would expect that once you have Firefox running on Mathlab, you can just refresh the window (control-R also works for this) to see the latest output.

  • Tue Feb 14 7:30pm: I screwed up with this week's assignment, specifically question 2. The question as given is correct, but the link on the assignments page to the data set is to the wrong file! The file asurv.dat (link) contains the data I intended you to use for the question, with the treatments labelled trt1, trt2, trt3. The file asurv2.dat (link), which is the one that got linked to from the assignments page, has four numerical columns, with the first two representing the treatment, coded as follows: 1 and 0 means treatment 1, 0 and 1 means treatment 2, and 0 and 0 means treatment 3. The following data step will make things behave properly:
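        data surv;
          infile 'asurv2.dat';
          /* a and b are the 0/1 treatment indicator columns */
          input a b survtime cens;
          /* combine the indicators into one label: 1-0, 0-1 or 0-0 */
          trt=cat(a,"-",b);
        run;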
        
    What this does is to create a treatment variable that takes three different values: 1-0 for treatment 1, 0-1 for treatment 2, and 0-0 for treatment 3. I don't mind which data file you use, though asurv.dat ought to work as advertised.

    In case you care, the version of SAS (9.1) that was installed on Mathlab last year couldn't handle a categorical variable (by means of a "class" statement) in proc phreg. Later versions of SAS (9.2, which I have, and 9.3, which Mathlab now has) handle it just fine. I wrote this question based on what 9.2 could do, and had to change it so my students could do it! Hence the two versions of the data file. I rewrote the question but didn't change the link on the assignments page. Oops. That link is now fixed.
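
    As a rough sketch of the difference (not the assignment solution): with SAS 9.2 or later, the trt variable can go straight into a class statement, while in 9.1 the two 0/1 columns would have to serve as the explanatory variables themselves. The sketch assumes the data set surv from the data step above, and also assumes that cens=0 marks a censored observation; check the question for the actual coding.

        /* SAS 9.2 and later: treatment as a categorical variable */
        /* assumption: cens=0 means the observation is censored */
        proc phreg data=surv;
          class trt;
          model survtime*cens(0)=trt;
        run;

        /* SAS 9.1: no class statement, so use the indicator columns a and b */
        proc phreg data=surv;
          model survtime*cens(0)=a b;
        run;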

  • Tue Feb 14 10:30am: This week's class is intended to cover multivariate analysis of variance and (one method of attacking) repeated measures ANOVA. I've put up an Assignment 4 that covers both of those. (I think we are now caught up on assignments.) With the reading week, you'll have two weeks to get Assignment 4 done.

    I'm hoping to find another example for repeated measures ANOVA before class. I'm not wildly keen on the example in the notes (at least, not as the only example). I'll try to get some notes up before class so that you can print them out (if that is your wish).

  • Mon Feb 13 1:00pm: Assignment 2 is marked, and the marked version should be in your mailboxes now. Some very good work, and generally I was impressed with what you did.
  • Fri Feb 10 2:00pm: I'm almost finished marking the assignment 2's, but since I head out soon to go pick up my daughter, I may not be able to finish the job until Monday. My solutions to Assignment 2 are going up in a moment.
  • Tue Feb 7 3:00pm: I'm now organized for tomorrow's class. The material for the lecture is a review of analysis of variance and multiple-comparison methods, the analysis of covariance, and (depending on time) the beginning of multivariate analysis of variance.

    Assignment 3 is up, due next week. The week after that is reading week, so you'll get two weeks to do assignment 4. My solutions to assignment 2 will go up once I have all the assignments handed in that I'm going to get.

  • Mon Feb 6 9:30pm: I've finished marking the assignment 1's, which should be in your e-mailboxes by now. Good work overall, especially after the delays getting everything working. I have a couple of general comments to share on Wednesday.

    Let me know if you have any problems opening or reading the marked assignments, which are in .pdf format, readable with Acrobat Reader or similar.

    My solutions to this assignment are up. Get to them from the assignments page.

  • Tue Jan 31 1:00pm: I haven't heard much from you guys, so I presume that things are working all right SAS-wise.

    Tomorrow's class will include the remainder of the stuff on logistic regression (probably the first hour) and some of the stuff on survival analysis (the second).

    There will also be a second assignment, which contains a little regression, and three problems on logistic regression, including one on the stuff that will occupy us in the first hour tomorrow. (I think the assignments are slowly catching us up. My intention was to have each "piece" of subject matter be one week's class, but that isn't working out yet.) In the unlikely event that you want something else to do after you finish assignment 1, the first two or three questions on assignment 2 are doable now.

  • Fri Jan 27 12:00pm: New Mathlab is working, and SAS is working on it! Two reminders:
    • Use your UTORid and password to log in.
    • The editor to use to write your code and look at your output is called gedit (not kwrite). (This program lives on the Mathlab machine; you don't need to worry about having a copy of it on your machine.)
    Any problems that you have now in accessing the machine can be addressed to Tianze Sun (tsun at utsc); any problems in making SAS behave can be addressed to me!

    I think we'll be good in having Assignment 1 due next Wednesday (by e-mail to me, to save paper). If you need an extension, let me know. Next week's assignment might have to be a bigger one, to get us caught up, but you'll be up to speed on SAS by then, won't you?

  • Wed Jan 25 11:00am: We may have progress on the Mathlab front, but not the SAS front yet. My instructions for you are these: for now, we can use mathlab-old. This is only a temporary measure since this machine will disappear at the end of Feb. If you have a UTSCid, you can use it to log into that machine; if you don't, send an e-mail to Tianze Sun (tsun at UTSC) and he will fix you up with a temporary account until we can get SAS running on the new Mathlab machine.

    Sorry about the delays in getting you SAS access. Last night, I was even seeing whether I could give you access to my machine, at least while it's plugged in at home. But no luck there.

  • Mon Jan 23 7:00pm: We have progress on the Mathlab front. As of about now you (all) should have access to the new Mathlab machine mathlab.utsc.utoronto.ca using your UTORid and password. Note that this applies whether you have a UTSCid or not.

    I am not sure whether SAS will be working when you log in or not. You can check in two parts: (a) see whether you can log in at all (whether your UTORid and password get you onto the machine at all), and (b) if you can log in, just type "sas" (with no quotes) at the command line. What should happen is that about 8 blue windows will pop up, one of which looks like a text editor. Let me know of any success or failure and any error messages you receive.

  • Mon Jan 23 12:00pm: The deadline for Assignment 1 is extended by one week to Wed Feb 1. Some people (maybe all of you) are still unable to access SAS, and I want to give you a fair chance to work through any problems. I've just e-mailed the tech guy again to see if at least I can get everyone accounts that will work on mathlab-old. There will be an Assignment 2 based on this week's class, but I will probably delay the due date for that to Feb 8. I don't like delaying things like this: for one thing, it's better for you to be working on things just after you've seen them in class. But we have to make do with what we have.
  • Wed Jan 18 5:00pm: From the horse's mouth - the old Mathlab machine was being retired, but the tech guy is having trouble getting SAS to install on the new machine. However, the old Mathlab machine is still accessible at mathlab-old.utsc.utoronto.ca. I just checked and SAS is running just fine there. So see whether you can log into mathlab-old using your UTSCid and password (or maybe even your UTORid and password). That is to say, follow all the instructions, except that wherever you see mathlab, replace it with mathlab-old.
  • Tue Jan 17 6:00pm: Mathlab is not a happy bunny right now (as in: *I* cannot log into it). I'll let you know when things are back up to speed, and then we can worry about getting everyone logged in.
  • Tue Jan 17 12:30pm: Some people have been having trouble accessing Mathlab. As soon as I can raise the person responsible for accounts on Mathlab, I'll see if I can get that fixed.

    Just so you know, there will be an Assignment 1 this week (due next week). It is up on the website, though it may not make much sense until after tomorrow's class. If the problems accessing Mathlab persist, I'm happy to allow extensions as needed.

  • Tue Jan 10 9:00pm: first class is tomorrow, Wed, in IC 328 from 2:00-4:00pm.

  This Web Page is maintained by Ken Butler
Last modified: Feb 14, 2012
© 2008-2012 University of Toronto at Scarborough. All rights reserved.