6. 手写代码

6.1 基本算法

6.1.1 冒泡排序

/**
 * 冒泡排序 时间复杂度 O(n^2) 空间复杂度O(1)
 */
public class BubbleSort {

   public static void bubbleSort(int[] data) {

      System.out.println("开始排序");
      int arrayLength = data.length;

      for (int i = 0; i < arrayLength - 1; i++) {

         boolean flag = false;

         for (int j = 0; j < arrayLength - 1 - i; j++) {
            if(data[j] > data[j + 1]){
               int temp = data[j + 1];
               data[j + 1] = data[j];
               data[j] = temp;
               flag = true;
            }
         }

         System.out.println(java.util.Arrays.toString(data));

         if (!flag)
            break;
      }
   }

   public static void main(String[] args) {

      int[] data = { 9, -16, 21, 23, -30, -49, 21, 30, 30 };

      System.out.println("排序之前:\n" + java.util.Arrays.toString(data));

      bubbleSort(data);

      System.out.println("排序之后:\n" + java.util.Arrays.toString(data));
   }
}

6.1.2 二分查找

图4-二分查找核心思路

实现代码:

/**
 * 二分查找 时间复杂度O(log2n);空间复杂度O(1)
 */
 
def binarySearch(arr:Array[Int],left:Int,right:Int,findVal:Int): Int={
  if(left>right){//递归退出条件,找不到,返回-1
    -1
  }

  val midIndex = (left+right)/2

  if (findVal < arr(midIndex)){//向左递归查找
    binarySearch(arr,left,midIndex-1,findVal)
  }else if(findVal > arr(midIndex)){//向右递归查找
    binarySearch(arr,midIndex+1,right,findVal)
  }else{//查找到,返回下标
    midIndex
  }
}

拓展需求:当一个有序数组中,有多个相同的数值时,如何将所有的数值都查找到。

代码实现如下:

/*
  {1,8, 10, 89, 1000, 1000,1234} 当一个有序数组中,有多个相同的数值时,如何将所有的数值都查找到,比如这里的 1000.
  //分析
  1. 返回的结果是一个可变数组 ArrayBuffer
  2. 在找到结果时,向左边扫描,向右边扫描 [条件]
  3. 找到结果后,就加入到ArrayBuffer
   */
  def binarySearch2(arr: Array[Int], l: Int, r: Int,
                    findVal: Int): ArrayBuffer[Int] = {

    //找不到条件?
    if (l > r) {
      return ArrayBuffer()
    }

    val midIndex = (l + r) / 2
    val midVal = arr(midIndex)
    if (midVal > findVal) {
      //向左进行递归查找
      binarySearch2(arr, l, midIndex - 1, findVal)
    } else if (midVal < findVal) { //向右进行递归查找
      binarySearch2(arr, midIndex + 1, r, findVal)
    } else {
      println("midIndex=" + midIndex)
      //定义一个可变数组
      val resArr = ArrayBuffer[Int]()
      //向左边扫描
      var temp = midIndex - 1
      breakable {
        while (true) {
          if (temp < 0 || arr(temp) != findVal) {
            break()
          }
          if (arr(temp) == findVal) {
            resArr.append(temp)
          }
          temp -= 1
        }
      }
      //将中间这个索引加入
      resArr.append(midIndex)
      //向右边扫描
      temp = midIndex + 1
      breakable {
        while (true) {
          if (temp > arr.length - 1 || arr(temp) != findVal) {
            break()
          }
          if (arr(temp) == findVal) {
            resArr.append(temp)
          }
          temp += 1
        }
      }
      return resArr
    }

6.1.3 快排

图1-快速排序核心思想

代码实现:

/**

 * 快排

 * 时间复杂度:平均时间复杂度为O(nlogn)

 * 空间复杂度:O(logn),因为递归栈空间的使用问题

 */

def quickSort(list: List[Int]): List[Int] = list match {

    case Nil => Nil

    case List() => List()

    case head :: tail =>

      val (left, right) = tail.partition(_ < head)

      quickSort(left) ::: head :: quickSort(right)

  }

6.1.4 归并

图2-归并排序核心思想

核心思想:不断的将大的数组分成两个小数组,直到不能拆分为止,即形成了单个值。此时使用合并的排序思想对已经有序的数组进行合并,合并为一个大的数据,不断重复此过程,直到最终所有数据合并到一个数组为止。

图3-归并排序“治”流程

代码实现:

/**
 * 快排 
 * 时间复杂度:O(nlogn)
 * 空间复杂度:O(n)
 */
def merge(left: List[Int], right: List[Int]): List[Int] = (left, right) match {
    case (Nil, _) => right
    case (_, Nil) => left
    case (x :: xTail, y :: yTail) =>
      if (x <= y) x :: merge(xTail, right)
      else y :: merge(left, yTail)
  }

6.1.5 二叉树之Scala实现

1)二叉树概念

2)二叉树的特点

(1)树执行查找、删除、插入的时间复杂度都是O(logN)

(2)遍历二叉树的方法包括前序、中序、后序

(3)非平衡树指的是根的左右两边的子节点的数量不一致

(4)在非空二叉树中,第i层的结点总数不超过 , i>=1;

(5)深度为h的二叉树最多有个结点(h>=1),最少有h个结点;

(6)对于任意一棵二叉树,如果其叶结点数为N0,而度数为2的结点总数为N2,则N0=N2+1;

3) 二叉树的Scala代码实现

定义节点以及前序、中序、后序遍历

class TreeNode(treeNo:Int){

  val no = treeNo
  var left:TreeNode = null
  var right:TreeNode = null

  //后序遍历
  def postOrder():Unit={
    //向左递归输出左子树
    if(this.left != null){
      this.left.postOrder
    }
    //向右递归输出右子树
    if (this.right != null) {
      this.right.postOrder
    }

    //输出当前节点值
    printf("节点信息 no=%d \n",no)
  }

  //中序遍历
  def infixOrder():Unit={
    //向左递归输出左子树
    if(this.left != null){
      this.left.infixOrder()
    }

    //输出当前节点值
    printf("节点信息 no=%d \n",no)

    //向右递归输出右子树
    if (this.right != null) {
      this.right.infixOrder()
    }
  }

  //前序遍历
  def preOrder():Unit={
    //输出当前节点值
    printf("节点信息 no=%d \n",no)

    //向左递归输出左子树
    if(this.left != null){
      this.left.postOrder()
    }

    //向右递归输出右子树
    if (this.right != null) {
      this.right.preOrder()
    }
  }

  //后序遍历查找
  def postOrderSearch(no:Int): TreeNode = {
    //向左递归输出左子树
    var resNode:TreeNode = null
    if (this.left != null) {
      resNode = this.left.postOrderSearch(no)
    }
    if (resNode != null) {
      return resNode
    }
    if (this.right != null) {
      resNode = this.right.postOrderSearch(no)
    }
    if (resNode != null) {
      return resNode
    }
    println("ttt~~")
    if (this.no == no) {
      return this
    }
    resNode
  }

  //中序遍历查找
  def infixOrderSearch(no:Int): TreeNode = {


    var resNode : TreeNode = null
    //先向左递归查找
    if (this.left != null) {
      resNode = this.left.infixOrderSearch(no)
    }
    if (resNode != null) {
      return resNode
    }
    println("yyy~~")
    if (no == this.no) {
      return this
    }
    //向右递归查找
    if (this.right != null) {
      resNode = this.right.infixOrderSearch(no)
    }
    return resNode

  }

  //前序查找
  def preOrderSearch(no:Int): TreeNode = {
    if (no == this.no) {
      return this
    }
    //向左递归查找
    var resNode : TreeNode = null
    if (this.left != null) {
      resNode = this.left.preOrderSearch(no)
    }
    if (resNode != null){
      return  resNode
    }
    //向右边递归查找
    if (this.right != null) {
      resNode = this.right.preOrderSearch(no)
    }

    return resNode
  }

  //删除节点
  //删除节点规则
  //1如果删除的节点是叶子节点,则删除该节点
  //2如果删除的节点是非叶子节点,则删除该子树

  def delNode(no:Int): Unit = {
    //首先比较当前节点的左子节点是否为要删除的节点
    if (this.left != null && this.left.no == no) {
      this.left = null
      return
    }
    //比较当前节点的右子节点是否为要删除的节点
    if (this.right != null && this.right.no == no) {
      this.right = null
      return
    }
    //向左递归删除
    if (this.left != null) {
      this.left.delNode(no)
    }
    //向右递归删除
    if (this.right != null) {
      this.right.delNode(no)
    }
  }
}

定义二叉树,前序、中序、后序遍历,前序、中序、后序查找,删除节点

class BinaryTree{
  var root:TreeNode = null

  //后序遍历
  def postOrder(): Unit = {
    if (root != null){
      root.postOrder()
    }else {
      println("当前二叉树为空,不能遍历")
}
}
    //中序遍历
    def infixOrder(): Unit = {
      if (root != null){
        root.infixOrder()
      }else {
        println("当前二叉树为空,不能遍历")
      }
    }
    //前序遍历
    def preOrder(): Unit = {
      if (root != null){
        root.preOrder()
      }else {
        println("当前二叉树为空,不能遍历")
      }
    }

    //后序遍历查找
    def postOrderSearch(no:Int): TreeNode = {
      if (root != null) {
        root.postOrderSearch(no)
      }else{
        null
      }
    }

    //中序遍历查找
    def infixOrderSeacher(no:Int): TreeNode = {
      if (root != null) {
        return root.infixOrderSearch(no)
      }else {
        return null
      }
    }

    //前序查找
    def preOrderSearch(no:Int): TreeNode = {

      if (root != null) {
        return root.preOrderSearch(no)
      }else{
        //println("当前二叉树为空,不能查找")
        return null
      }
    }
//删除节点
    def delNode(no:Int): Unit = {
      if (root != null) {
        //先处理一下root是不是要删除的
        if (root.no == no){
          root = null
        }else {
          root.delNode(no)
        }
      }
    
  }

6.2 开发代码

6.2.1 手写Spark-WordCount

val conf: SparkConf = 
new SparkConf().setMaster("local[*]").setAppName("WordCount")

val sc = new SparkContext(conf)

sc.textFile("/input")
  .flatMap(_.split(" "))
  .map((_, 1))
  .reduceByKey(_ + _)
  .saveAsTextFile("/output")

sc.stop()

6.3 手写HQL

6.3.1 手写HQL 第1题

表结构:uid,subject_id,score

求:找出所有科目成绩都大于某一学科平均成绩的学生

数据集如下

1001	01	90
1001	02	90
1001	03	90
1002	01	85
1002	02	85
1002	03	70
1003	01	70
1003	02	70
1003	03	85

1)建表语句

create table score(
    uid string,
    subject_id string,
    score int)
row format delimited fields terminated by '\t'; 

2)求出每个学科平均成绩

select
    uid,
    score,
    avg(score) over(partition by subject_id) avg_score
from
    score;t1

3)根据是否大于平均成绩记录flag,大于则记为0否则记为1

select
    uid,
    score,
    avg(score) over(partition by subject_id) avg_score
from
    score;t1

4)根据学生id进行分组统计flag的和,和为0则是所有学科都大于平均成绩

select
    uid
from
    t2
group by
    uid
having
    sum(flag)=0;

5)最终SQL

select
    uid
from
    (select
    uid,
    if(score>avg_score,0,1) flag
from
    (select
    uid,
    score,
    avg(score) over(partition by subject_id) avg_score
from
    score)t1)t2
group by
    uid
having
    sum(flag)=0;

6.3.2 手写HQL 第2题

我们有如下的用户访问数据

userIdvisitDatevisitCount
u012017/1/215
u022017/1/236
u032017/1/228
u042017/1/203
u012017/1/236
u012017/2/218
U022017/1/236
U012017/2/224

要求使用SQL统计出每个用户的累积访问次数,如下表所示:

用户id月份小计累积
u012017-011111
u012017-021223
u022017-011212
u032017-0188
u042017-0133

数据集

u01     2017/1/21       5
u02     2017/1/23       6
u03     2017/1/22       8
u04     2017/1/20       3
u01     2017/1/23       6
u01     2017/2/21       8
u02     2017/1/23       6
u01     2017/2/22       4

1)创建表

create table action
(userId string,
visitDate string,
visitCount int) 
row format delimited fields terminated by "\t";

2)修改数据格式

select
     userId,
     date_format(regexp_replace(visitDate,'/','-'),'yyyy-MM') mn,
     visitCount
from
     action;t1

3)计算每人单月访问量

select
    userId,
    mn,
    sum(visitCount) mn_count
from
    t1
group by 
userId,mn;t2

4)按月累计访问量

select
    userId,
    mn,
    mn_count,
    sum(mn_count) over(partition by userId order by mn)
from t2;

5)最终SQL

select
    userId,
    mn,
    mn_count,
    sum(mn_count) over(partition by userId order by mn)
from 
(   select
        userId,
        mn,
        sum(visitCount) mn_count
    from
         (select
             userId,
             date_format(regexp_replace(visitDate,'/','-'),'yyyy-MM') mn,
             visitCount
         from
             action)t1
group by userId,mn)t2;

6.3.3 手写HQL 第3题

有50W个京东店铺,每个顾客访客访问任何一个店铺的任何一个商品时都会产生一条访问日志,访问日志存储的表名为Visit,访客的用户id为user_id,被访问的店铺名称为shop,请统计:

1)每个店铺的UV(访客数)

2)每个店铺访问次数top3的访客信息。输出店铺名称、访客id、访问次数

数据集

u1	a
u2	b
u1	b
u1	a
u3	c
u4	b
u1	a
u2	c
u5	b
u4	b
u6	c
u2	c
u1	b
u2	a
u2	a
u3	a
u5	a
u5	a
u5	a

1)建表

create table visit(user_id string,shop string) row format delimited fields terminated by '\t';

2)每个店铺的UV(访客数)

select shop,count(distinct user_id) from visit group by shop;

3)每个店铺访问次数top3的访客信息。输出店铺名称、访客id、访问次数

(1)查询每个店铺被每个用户访问次数

select shop,user_id,count(*) ct
from visit
group by shop,user_id;t1

(2)计算每个店铺被用户访问次数排名

select shop,user_id,ct,rank() over(partition by shop order by ct) rk
from t1;t2

(3)取每个店铺排名前3的

select shop,user_id,ct
from t2
where rk<=3;

(4)最终SQL

select 
shop,
user_id,
ct
from
(select 
shop,
user_id,
ct,
rank() over(partition by shop order by ct) rk
from 
(select 
shop,
user_id,
count(*) ct
from visit
group by 
shop,
user_id)t1
)t2
where rk<=3;

6.3.4 手写HQL 第4题

已知一个表STG.ORDER,有如下字段:Date,Order_id,User_id,amount。请给出sql进行统计:数据样例:2017-01-01,10029028,1000003251,33.57。

1)给出 2017年每个月的订单数、用户数、总成交金额。

2)给出2017年11月的新客数(指在11月才有第一笔订单)

建表

create table order_tab(dt string,order_id string,user_id string,amount decimal(10,2)) row format delimited fields terminated by '\t';

1)给出 2017年每个月的订单数、用户数、总成交金额。

select
   date_format(dt,'yyyy-MM'),
   count(order_id),
   count(distinct user_id),
   sum(amount)
from
   order_tab
where
   date_format(dt,'yyyy')='2017'
group by
   date_format(dt,'yyyy-MM');

2)给出2017年11月的新客数(指在11月才有第一笔订单)

select
   count(user_id)
from
   order_tab
group by
   user_id
having
   date_format(min(dt),'yyyy-MM')='2017-11';

6.3.5 手写HQL 第5题

有日志如下,请写出代码求得所有用户和活跃用户的总数及平均年龄。(活跃用户指连续两天都有访问记录的用户)日期 用户 年龄

数据集

2019-02-11,test_1,23
2019-02-11,test_2,19
2019-02-11,test_3,39
2019-02-11,test_1,23
2019-02-11,test_3,39
2019-02-11,test_1,23
2019-02-12,test_2,19
2019-02-13,test_1,23
2019-02-15,test_2,19
2019-02-16,test_2,19

1)建表

create table user_age(dt string,user_id string,age int)row format delimited fields terminated by ',';

2)按照日期以及用户分组,按照日期排序并给出排名

select
   dt,
   user_id,
   min(age) age,
   rank() over(partition by user_id order by dt) rk
from
   user_age
group by
   dt,user_id;t1

3)计算日期及排名的差值

select
   user_id,
   age,
   date_sub(dt,rk) flag
from
   t1;t2

4)过滤出差值大于等于2的,即为连续两天活跃的用户

select
   user_id,
   min(age) age
from
   t2
group by
   user_id,flag
having
   count(*)>=2;t3

5)对数据进行去重处理(一个用户可以在两个不同的时间点连续登录),例如:a用户在1月10号1月11号以及1月20号和1月21号4天登录。

select
   user_id,
   min(age) age
from
   t3
group by
   user_id;t4

6)计算活跃用户(两天连续有访问)的人数以及平均年龄

select
   count(*) ct,
   cast(sum(age)/count(*) as decimal(10,2))
from t4;

7)对全量数据集进行按照用户去重

select
   user_id,
   min(age) age 
from
   user_age 
group by 
   user_id;t5

8)计算所有用户的数量以及平均年龄

select
   count(*) user_count,
   cast((sum(age)/count(*)) as decimal(10,1)) 
from 
   t5;

9)将第5步以及第7步两个数据集进行union all操作

select
   0 user_total_count,
   0 user_total_avg_age,
   count(*) twice_count,
   cast(sum(age)/count(*) as decimal(10,2)) twice_count_avg_age
from 
(
   select
   user_id,
   min(age) age
from
   (select
   user_id,
   min(age) age
from
   (
    select
   user_id,
   age,
   date_sub(dt,rk) flag
from
   (
    select
       dt,
       user_id,
       min(age) age,
       rank() over(partition by user_id order by dt) rk
    from
       user_age
    group by
       dt,user_id
    )t1
    )t2
group by
   user_id,flag
having
   count(*)>=2)t3
group by
   user_id 
)t4

union all

select
   count(*) user_total_count,
   cast((sum(age)/count(*)) as decimal(10,1)),
   0 twice_count,
   0 twice_count_avg_age
from 
   (
      select
         user_id,
         min(age) age 
      from 
         user_age 
      group by 
         user_id
   )t5;t6

10)求和并拼接为最终SQL

select

    sum(user_total_count),

    sum(user_total_avg_age),

    sum(twice_count),

    sum(twice_count_avg_age)

from

(select

   0 user_total_count,

   0 user_total_avg_age,

   count(*) twice_count,

   cast(sum(age)/count(*) as decimal(10,2)) twice_count_avg_age

from

(

   select

   user_id,

   min(age) age

from

   (select

   user_id,

   min(age) age

from

   (

    select

   user_id,

   age,

   date_sub(dt,rk) flag

from

   (

    select

       dt,

       user_id,

       min(age) age,

       rank() over(partition by user_id order by dt) rk

    from

       user_age

    group by

       dt,user_id

    )t1

    )t2

group by

   user_id,flag

having

   count(*)>=2)t3

group by

   user_id

)t4

union all

select

   count(*) user_total_count,

   cast((sum(age)/count(*)) as decimal(10,1)),

   0 twice_count,

   0 twice_count_avg_age

from

   (

      select

         user_id,

         min(age) age

      from

         user_age

      group by

         user_id

   )t5)t6;

6.3.6 手写HQL 第6题

请用sql写出所有用户中在今年10月份第一次购买商品的金额,表ordertable字段(购买用户:userid,金额:money,购买时间:paymenttime(格式:2017-10-01),订单id:orderid)

1)建表

create table ordertable(
    userid string,
    money int,
    paymenttime string,
    orderid string)
row format delimited fields terminated by '\t';

2)查询出

select
   userid,
   min(paymenttime) paymenttime
from
   ordertable
where
   date_format(paymenttime,'yyyy-MM')='2017-10'
group by
   userid;t1

select
   t1.userid,
   t1.paymenttime,
   od.money
from
   t1
join
   ordertable od
on
   t1.userid=od.userid
   and
   t1.paymenttime=od.paymenttime;

select
   t1.userid,
   t1.paymenttime,
   od.money
from
   (select
   userid,
   min(paymenttime) paymenttime
from
   ordertable
where
   date_format(paymenttime,'yyyy-MM')='2017-10'
group by
   userid)t1
join
   ordertable od
on
   t1.userid=od.userid
   and
   t1.paymenttime=od.paymenttime;

6.3.7 手写HQL 第7题

有一个线上服务器访问日志格式如下(用sql答题)

时间                    接口                         ip地址

2016-11-09 11:22:05    /api/user/login                  110.23.5.33

2016-11-09 11:23:10    /api/user/detail                  57.3.2.16

.....

2016-11-09 23:59:40    /api/user/login                  200.6.5.166

求11月9号下午14点(14-15点),访问api/user/login接口的top10的ip地址

数据集

2016-11-09 14:22:05	/api/user/login	110.23.5.33
2016-11-09 11:23:10	/api/user/detail	57.3.2.16
2016-11-09 14:59:40	/api/user/login	200.6.5.166
2016-11-09 14:22:05	/api/user/login	110.23.5.34
2016-11-09 14:22:05	/api/user/login	110.23.5.34
2016-11-09 14:22:05	/api/user/login	110.23.5.34
2016-11-09 11:23:10	/api/user/detail	57.3.2.16
2016-11-09 23:59:40	/api/user/login	200.6.5.166
2016-11-09 14:22:05	/api/user/login	110.23.5.34
2016-11-09 11:23:10	/api/user/detail	57.3.2.16
2016-11-09 23:59:40	/api/user/login	200.6.5.166
2016-11-09 14:22:05	/api/user/login	110.23.5.35
2016-11-09 14:23:10	/api/user/detail	57.3.2.16
2016-11-09 23:59:40	/api/user/login	200.6.5.166
2016-11-09 14:59:40	/api/user/login	200.6.5.166
2016-11-09 14:59:40	/api/user/login	200.6.5.166

1)建表

create table ip(
    time string,
    interface string,
    ip string)
row format delimited fields terminated by '\t';

2)最终SQL

select
    ip,
    interface,
    count(*) ct
from
    ip
where
    date_format(time,'yyyy-MM-dd HH')>='2016-11-09 14'
    and 
    date_format(time,'yyyy-MM-dd HH')<='2016-11-09 15'
    and
    interface='/api/user/login'
group by
ip,interface
order by
    ct desc
limit 2;t1

6.3.8 手写SQL 第8题

有一个账号表如下,请写出SQL语句,查询各自区组的money排名前十的账号(分组取前10)

1)建表(MySQL)

CREATE TABLE `account`
(   `dist_id` int(11)DEFAULT NULL COMMENT '区组id',
    `account` varchar(100)DEFAULT NULL COMMENT '账号',
    `gold` int(11)DEFAULT 0 COMMENT '金币');

2)最终SQL

select
    *
from
    account as a
where
    (select
        count(distinct(a1.gold))
    from
        account as a1 
    where
        a1.dist_id=a.dist_id
        and
        a1.gold>a.gold)<3;

6.3.9 手写HQL 第9题

1)有三张表分别为会员表(member)销售表(sale)退货表(regoods)

(1)会员表有字段memberid(会员id,主键)credits(积分);

(2)销售表有字段memberid(会员id,外键)购买金额(MNAccount);

(3)退货表中有字段memberid(会员id,外键)退货金额(RMNAccount)。

2)业务说明

(1)销售表中的销售记录可以是会员购买,也可以是非会员购买。(即销售表中的memberid可以为空);

(2)销售表中的一个会员可以有多条购买记录;

(3)退货表中的退货记录可以是会员,也可是非会员;

(4)一个会员可以有一条或多条退货记录。

查询需求:分组查出销售表中所有会员购买金额,同时分组查出退货表中所有会员的退货金额,把会员id相同的购买金额-退款金额得到的结果更新到表会员表中对应会员的积分字段(credits)

数据集

sale
1001    50.3
1002    56.5
1003    235
1001    23.6
1005    56.2
        25.6
        33.5

regoods
1001    20.1
1002    23.6
1001    10.1
        23.5
        10.2
1005    0.8

1)建表

create table member(memberid string,credits double) row format delimited fields terminated by '\t';

create table sale(memberid string,MNAccount double) row format delimited fields terminated by '\t';

create table regoods(memberid string,RMNAccount double) row format delimited fields terminated by '\t';

2)最终SQL

insert into table member
select
    t1.memberid,
    MNAccount-RMNAccount
from
    (select
        memberid,
        sum(MNAccount) MNAccount
    from
        sale
    where
        memberid!=''
    group by
        memberid
    )t1
join
    (select
        memberid,
        sum(RMNAccount) RMNAccount
    from
        regoods
    where
        memberid!=''
    group by
        memberid
    )t2
on
    t1.memberid=t2.memberid;

6.3.10 手写HQL 第10题

1.用一条SQL语句查询出每门课都大于80分的学生姓名

name   kecheng   fenshu

张三    语文    81

张三    数学    75

李四    语文    76

李四    数学     90

王五    语文    81

王五    数学    100

王五    英语    90

A: select distinct name from table where name not in (select distinct name from table where fenshu<=80)

B:select name from table group by name having min(fenshu)>80

2. 学生表 如下:
自动编号   学号  姓名 课程编号 课程名称 分数
1     2005001 张三   0001   数学   69
2     2005002 李四   0001   数学   89
3     2005001 张三   0001   数学   69
删除除了自动编号不同, 其他都相同的学生冗余信息

A: delete tablename where 自动编号 not in(select min(自动编号) from tablename group by学号, 姓名, 课程编号, 课程名称, 分数)

3.一个叫team的表,里面只有一个字段name,一共有4条纪录,分别是a,b,c,d,对应四个球队,现在四个球队进行比赛,用一条sql语句显示所有可能的比赛组合.

答:select a.name, b.name
from team a, team b
where a.name < b.name

4.面试题:怎么把这样一个
year   month amount
1991   1     1.1
1991   2     1.2
1991   3     1.3
1991   4     1.4
1992   1     2.1
1992   2     2.2
1992   3     2.3
1992   4     2.4
查成这样一个结果
year m1  m2  m3   m4
1991 1.1 1.2 1.3 1.4
1992 2.1 2.2 2.3 2.4 

答案
select year, 
(select amount from aaa m where month=1 and m.year=aaa.year) as m1,
(select amount from aaa m where month=2 and m.year=aaa.year) as m2,
(select amount from aaa m where month=3 and m.year=aaa.year) as m3,
(select amount from  aaa m where month=4 and m.year=aaa.year) as m4
from aaa group by year

*********************************************************************5.说明:复制表(只复制结构,源表名:a新表名:b) 

SQL: select * into b from a where 1<>1 (where1=1,拷贝表结构和数据内容)
ORACLE:create table b

As

Select * from a where 1=2

[<>(不等于)(SQL Server Compact)

比较两个表达式。 当使用此运算符比较非空表达式时,如果左操作数不等于右操作数,则结果为 TRUE。 否则,结果为 FALSE。]

6.

原表:
courseid coursename score
-------------------------------------
1 java 70
2 oracle 90
3 xml 40
4 jsp 30
5 servlet 80
-------------------------------------
为了便于阅读,查询此表后的结果显式如下(及格分数为60):
courseid coursename score mark
---------------------------------------------------
1 java 70 pass
2 oracle 90 pass
3 xml 40 fail
4 jsp 30 fail
5 servlet 80 pass
---------------------------------------------------
写出此查询语句
select courseid, coursename ,score ,if(score>=60, "pass","fail")  as mark from course

7.表名:购物信息

购物人      商品名称     数量

A            甲          2

B            乙          4

C            丙          1

A            丁          2

B            丙          5

……

给出所有购入商品为两种或两种以上的购物人记录

答:select * from 购物信息 where 购物人 in (select 购物人 from 购物信息 group by 购物人 having count(*) >= 2);

8.

info 表

date result

2005-05-09 win

2005-05-09 lose

2005-05-09 lose

2005-05-09 lose

2005-05-10 win

2005-05-10 lose

2005-05-10 lose

如果要生成下列结果, 该如何写sql语句?

         win lose

2005-05-09  2   2

2005-05-10  1   2

答案:

(1) select date, sum(case when result = "win" then 1 else 0 end) as "win", sum(case when result = "lose" then 1 else 0 end) as "lose" from info group by date;

(2) select a.date, a.result as win, b.result as lose

  from

  (select date, count(result) as result from info where result = "win" group by date) as a

  join

  (select date, count(result) as result from info where result = "lose" group by date) as b

on a.date = b.date;